




SUPPLEMENT TO THE JOURNAL 

OF THE 


ROYAL 

STATISTICAL SOCIETY. 


Founded 1834 

Incorporated by Royal Charter 1887. 


VOL. VIII.— No. 1, 1946. 



LONDON : 

THE ROYAL STATISTICAL SOCIETY, 
4, PORTUGAL STREET, W.C.2 
(1946.) 



CONTENTS 

VOL. VIII, NO. 1, 1946.


PAGES

Sequential Tests in Industrial Statistics. By G. A. Barnard, M.A. 1-21

Discussion on the Paper 22-26

Symposium on Autocorrelation in Time Series:

On the Theoretical Specification of Sampling Properties of Autocorrelated
Time Series. By M. S. Bartlett 27-41

Some Instruments for the Analysis of Time Series and their Application to
Textile Research. By G. A. R. Foster 42-61

Random Processes in Problems of Air Warfare. By L. B. C. Cunningham
and W. R. B. Hynd 62-85

Discussion on the Papers 85-97

Sequential Sampling Formulae for a Binomial Population. By J. P. Burman, B.A. 98-103

Some Properties of Closed Sequential Schemes. By C. M. Stockman and P. Armitage 104-112

A Modified Probit Technique for Small Probabilities. By M. S. Bartlett 113-117

The Analysis of a Series of Experiments by the Use of Punched Cards. By O.
Kempthorne 118-127

The Statistical Analysis of Variance-Heterogeneity and the Logarithmic
Transformation. By M. S. Bartlett and D. G. Kendall 128-138



SUPPLEMENT TO THE JOURNAL 


OF THE 


ROYAL 

STATISTICAL SOCIETY. 


Founded 1834
Incorporated by Royal Charter 1887.


VOL. VIII, No. 2, 1946.


LONDON : 

THE ROYAL STATISTICAL SOCIETY. 
4, PORTUGAL STREET, W.C.2 
(1946.) 



CONTENTS 

VOL. VIII, PART II, 1946.


Statistical Methods in the Selection of Navy and Army Personnel. By P. E. Vernon 139-148

Discussion on the Paper 148-153

The Application of Some Commercial Calculating Machines to certain Statistical
Calculations. By H. O. Hartley 154-172

Discussion on the Paper 173-183

The “Effective” Number of Independent Observations in an Auto-correlated Time
Series. By G. V. Bayley and J. M. Hammersley 184-197

Average Sampling Numbers from Finite Lots. By S. Vajda 198-201

The Use of the Negative Binomial Distribution in an Industrial Sampling Problem. By
M. E. Wise 202-211

A Table of Lagrangian Coefficients for Logarithmic Interpolation of Standard Statistical
Tables to obtain other Probability Levels. By J. T. Richardson 212-215

Linear Sequential Rectifying Inspection for controlling Fraction Defective. By F. J.
Anscombe 216-222

On the Distribution of the Sum of n Sample Values drawn from a Truncated Normal
Population. By V. J. Francis 223-232

Statistical Control applied to High Duty Iron Production. By E. W. Harding 233-243

Ultimate Risks in Sampling Inspection. By A. H. R. Grimsey 244-250





SUPPLEMENT TO THE 


Journal of the Royal Statistical Society 

Vol. VIII, No. 1, 1946 


Proceedings of the First Meeting of the Research Section of the Royal Statistical Society,

Session 1945-6. Held on Tuesday, December 4th, 1945, Dr. J. Wishart in the Chair.

The Chairman, in opening the meeting, said that this was the inaugural meeting of a new Section. 
The former Industrial and Agricultural Research Section was founded in 1933, and held its last 
meeting in May 1939. On the outbreak of war it was clear that those who would be likely to
contribute to its proceedings would be heavily engaged in war-time activities, and with some regret
the proceedings were suspended. At the same time the Supplement to the Society's Journal, which 
had been the organ of the Industrial and Agricultural Research Section for the publication of its 
proceedings, was also suspended. 

It had now been decided to form two Sections. The Research Section, of which this was the 
first meeting, would present papers on the theory of statistics and statistical methods, or on the 
development of new applications, and, in addition, an Industrial Applications Section had been 
formed which it was intended should operate on a regional basis in a number of groups, and would 
continue over a wider geographical field the series of discussions which had been taking place in
London during the war. The publication of the Supplement would also be resumed. This would 
normally contain the papers and discussions at the Research Section meetings, of which it was 
hoped to have four in each session, but it would also be a medium for publishing general papers 
on statistics not read at the meetings, and it would contain selected papers and discussions coming 
from the Industrial Applications Section. It was fitting that at this inaugural meeting the subject 
chosen should be one which arose among groups of workers dealing with war problems on both 
sides of the Atlantic, and should be presented by a member of one of these groups. 

The following paper was then read : 

Sequential Tests in Industrial Statistics
By G. A. Barnard, M.A. 

Summary 

After an introductory and an historical note, an elementary problem of simple qualitative
inspection of a box of components is treated by using a “lattice diagram representation.” This
leads to the consideration of sequential tests for such cases. Procedures for determining
“Target-Handicap” forms of inspection, and their operating and sample size properties are given.

This leads to a consideration of general linear sequential tests, which are those test procedures
which can be formulated in terms of a “score.” Such procedures are shown to be similar to
classical games of chance, and to physical diffusion processes. The diffusion analogy leads to a
differential equation which gives the approximate characteristics of any such linear test.

In many cases, Wald's “Probability Ratio Sequential Test” takes the form of a linear test.
The conditions for this are determined. The P.R.S. test is seen to be the “best possible linear test,”
in the sense of minimizing average sample size. The effects of deviations from normality, and
general distributions are considered.

Reference is made to Wald's work on tests which involve parameters other than those being
estimated, and then consideration is restricted to tests for the mean of normal populations where
the variance is unknown. Methods of reducing such tests to simple binomial tests are indicated.

A number of procedures for use with 2 × 2 comparative trials, and double dichotomies, are
given, and their properties discussed.

Returning to general inspection problems, the paper indicates that these are not always to be
identified with problems involving merely tests of statistical hypotheses. The notions of
Consumer's Lot, Producer's Batch, the Lot Quality Curve, the Process Curve, are explained, and
their importance indicated. A distinction is made between Acceptance Inspection schemes
and Rectifying Inspection schemes, and the notions of Operating Characteristic Curve, Operating
Characteristic Matrix, and the Sample Size distribution function are explained.

The lattice diagram is used to bring out relationships between notions involved in general 
inspection, and some other uses are also indicated. 

Finally, some reflections on the relevance of the matters discussed to matters of current debate 
among statisticians are given. 


Contents

Introduction and Historical Note.

Part I: A Simple Inspection Problem - the inspection diagram.
Part II: General Sequential Tests.

Part III: General Inspection Problems. 

Conclusion. 


Introduction 


The distinction between sequential tests and what may be called classical tests is illustrated by
various games of skill. 

In football, and in chess championship matches, the duration of play is fixed in advance. In 
football, the game goes on for an hour and a half; in a chess championship the match goes on for 
twenty-one games. In tennis, on the other hand, the duration of play is not fixed in advance. 
A certain minimum number of points must be played, but after that the game ends whenever 
one player has gained at least two points over his opponent. And in bridge, the game goes to 
the pair who first score 100 points below the line.

Football and chess championships correspond to classical test procedures; tennis and bridge 
correspond to sequential test procedures. In classical procedures the number of observations to 
be made is fixed in advance of the experiment; in sequential procedures the results of the observa- 
tions themselves are allowed to influence the number of observations made. 

Sequential procedures are not always practicable. If we are experimenting with the growth
of trees, for example, it may take years for a tree to grow to maturity; in such a case it would 
be absurd, in our present system of society, to grow one tree first, and see what happened, and then 
to plant another, and so on. On the other hand, if we are examining boxes of mass-produced 
components, we may in any case take components from the boxes one by one, or two by two. 
With the trees, a sequential procedure would be inapplicable; with the components, provided 
we can take them in random order, a sequential procedure would be not merely applicable, but
in fact highly desirable. 

When they are applicable, sequential procedures are desirable because they enable us to reduce
the average amount of experimental labour involved in a test of given precision, very often by as 
much as 50 per cent. 

Sequential procedures save time in two ways. First, they make us stop work just as soon 
as we have enough evidence on the points in question. And, second, they enable us to use judg- 
ment in arranging the experiment, and to use any plausible guesses we can make about the true 
results to minimize the work involved; these guesses may be wrong, but if they are, they affect 
only the amount of work we do, not the validity of our results. 


Historical note 

Wald's paper (1) contains a historical note on the origins of sequential ideas, and so my remarks
need be only supplementary. As Wald points out, many authors have used sequential ideas in
particular cases, without apparently realizing the general implications. His work, done in April
1943, seems to have been the first general attack, while mine was started in June of that year.
Some of our results were published in reports whose circulation was restricted during the war.
Beginning independently, we afterwards tried to keep in step with the American work, but we
succeeded only partly, as was natural. In the following account many of the results overlap with
Wald's; but the approach and methods used are different, in most cases.

Many people will find themselves in the position of M. Jourdain: they have, in a sense, been
applying sequential methods all the time without knowing it. Engineering inspectors, untutored
in the theory of statistical hypotheses, often judge batches of components by taking first a small
sample, and then another if the first is not decisive. And scientists often have their doubts about
some theory gradually removed by a steady accumulation of evidence. Sequential theory only
makes precise these common-sense procedures.

In particular, Dr. Case seems to have been using a sequential method for estimating a
probability p for some time past, and a mathematical theory of his method has been worked out
by Haldane (2).

It should also be pointed out that many of the problems of mathematical probability which
are involved in sequential procedures are of enormously long standing. The “Problem of Points,”
considered by Fermat and Pascal, is closely related to our work, and later historical references
are so numerous that they are perhaps best covered by a reference to Isaac Todhunter's book (3),
which forms a good Baedeker for the early literature. Since Todhunter, too, many problems in
physics have led to work which can be taken over into the theory of sequential tests. Mr. Tweedie's
work (4) on “inverse variates” seems to have arisen in such a connection. And Khintchine's
book (5) should also be mentioned.

I. A simple inspection problem: the inspection diagram

Suppose we inspect a box of components which we can classify simply into “effectives” and
“defectives.” We are interested in the fraction defective, p, the proportion of defective
components to the total number. We assume that the number of components in the box is large
compared with the number of components we actually examine, and that our sampling procedure
is random, so that the probability that each component examined will be defective is p, and the
probability that it will be effective is q = 1 - p.

We also assume that we are operating an “Acceptance Inspection Plan,” that is, an inspection
plan in which we confine our activities to acceptance or rejection of the batch, and we do not take
into account any possibility of improving the batch quality by replacing defective items by effective
items. With this assumption, the properties of our plan in relation to the single batch we are
now considering are specified by the Operating Characteristic Curve (O.C. curve), which gives the
probability O(p) of accepting the batch, as a function of p.

Any sensible inspection plan will always accept perfectly good batches (p = 0) and always
reject perfectly bad batches (p = 1). In general our O.C. curve will be S-shaped (Fig. 1),
with O(p) decreasing as p increases from 0 to 1. And as a matter of practical experience it is
found that most O.C. curves are sufficiently well represented by a straight line, if drawn on
logarithmic probability paper using the logarithmic scale for p. Consequently, if we determine
values $p_1$, $p_2$, such that

$$O(p_1) = 100/101 \quad\text{and}\quad O(p_2) = 1/101$$

we can obtain the rest of the O.C. curve, between these two points, by plotting on log probability 
paper and joining by a straight line. Although the approximation breaks down when p lies
outside the interval $p_1 < p < p_2$, we can, for practical purposes, regard the O.C. curve as being
determined by these two values, $p_1$, $p_2$, called respectively the Producer's 100 to 1 Safe Point
and the Producer's 100 to 1 Risk Point.
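The straight-line rule can be tried numerically. The sketch below is only an illustration under stated assumptions: it takes "logarithmic probability paper" to mean a normal-deviate (probit) scale for O(p) against log p, and the function name `oc_from_safe_and_risk` and the example points 0.01 and 0.05 are hypothetical, not taken from the paper.

```python
from math import log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def oc_from_safe_and_risk(p, p_safe, p_risk, odds=100.0):
    """Approximate O(p) by a straight line on logarithmic probability paper.

    Horizontal axis: log p.  Vertical axis: normal deviate of O(p) (assumed).
    The line is fixed by O(p_safe) = odds/(odds+1) and O(p_risk) = 1/(odds+1).
    """
    z_safe = _STD_NORMAL.inv_cdf(odds / (odds + 1.0))
    z_risk = _STD_NORMAL.inv_cdf(1.0 / (odds + 1.0))
    # Fraction of the way from the Safe Point to the Risk Point on the log scale.
    frac = (log(p) - log(p_safe)) / (log(p_risk) - log(p_safe))
    return _STD_NORMAL.cdf(z_safe + frac * (z_risk - z_safe))


if __name__ == "__main__":
    # Hypothetical Safe and Risk Points of 1 and 5 per cent. defective.
    for p in (0.01, 0.02, 0.03, 0.05):
        print(f"p = {p:.2f}   O(p) ~ {oc_from_safe_and_risk(p, 0.01, 0.05):.3f}")
```

As the text warns, values of p outside the interval between the two points should not be trusted to this interpolation.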

If the Risk and Safe points are fixed, the O.C. curve is practically fixed, and the problem of
designing a good acceptance inspection scheme reduces to that of finding one scheme, out of all
those “equivalent” schemes having the same Risk and Safe points, which will minimize inspection
costs. And in this section we assume that inspection costs are measured solely by the number
of items to be inspected, so that minimizing costs means minimizing the expected number of 
observations we make. 

The inspection diagram

If we can inspect our components one by one, we can represent our results by the “random
walk” diagram, or the “inspection diagram,” as I shall call it (Fig. 2). Taking axes in the plane
and starting at the origin, we move unit distance to the right each time we find an effective item
and unit distance upwards each time we find a defective item. Then the inspection process is
represented by a zig-zag line OP. The co-ordinates (x, y) of P give the summary result of the
inspection up to this stage: we have found x effectives and y defectives in the total number
(x + y) inspected so far. As inspection proceeds, P moves outwards from the origin.

Fig. 2. Inspection Diagram.

The adoption of any particular inspection plan will correspond to drawing two lines on the 
inspection diagram — the Acceptance Boundary and the Rejection Boundary. As soon as P 
reaches a point on the Acceptance Boundary, the batch is accepted, while as soon as P reaches 
the Rejection Boundary, the batch is rejected. The decision on the batch depends on which of 
these two possibilities occurs first. And the determination of the inspection plan reduces to the 
determination of the equations 

$$y = A(x), \qquad y = R(x) \tag{1}$$

to the acceptance and rejection boundaries. The A and R boundaries, taken together, are called 
the inspection boundary. 

Fundamental properties of the inspection diagram 

One of the most useful features of the inspection diagram is that it enables us easily to
distinguish between possibilities which are mutually exclusive, and possibilities which are not
mutually exclusive. Thus the calculation of the various probabilities associated with an inspection
scheme is simplified.

In particular, the points on the inspection boundary represent mutually exclusive possibilities,
while those points on the line

$$x + y = n$$

which lie between the A and R boundaries represent mutually exclusive and exhaustive possibilities
at the nth stage of inspection, when the sample size is n.

A second feature of the inspection diagram is based on the fact that our assumption that the
order of sampling is random is equivalent to the assumption that all paths from the origin O to
any point P are equally likely, provided, of course, that the paths always move upwards or to the
right, in accordance with the rules, and provided also that they are “permissible” paths, which
do not cross the inspection boundary. Consequently the total probability, Tpr(X), of reaching
any point X within the inspection boundary is given by

$$\operatorname{Tpr}(X) = N'(OX) \cdot P(X) \tag{2}$$

where N'(OX) = number of distinct permissible paths from O to X,

and P(X) = probability of reaching X from O by any one path.

Suppose, for example, that we have a large batch of fraction defective p, and we operate a
“Single Sample Scheme,” involving a single sample of 10 items, and we reject if there are 3 or
more defectives in the sample. Then our inspection boundary will be as shown in Fig. 2, and the
equations (1) will be

$$y = A(x) = 10 - x \quad (8 \le x \le 10)$$

$$y = R(x) = 10 - x \quad (0 \le x \le 7)$$

and

$$O(p) = N(O,(10,0))\, q^{10} + N(O,(9,1))\, q^{9} p + N(O,(8,2))\, q^{8} p^{2}$$

where N(OX) represents the total number of paths (permissible or not) from O to X. In our
case, for example,

$$N(O,(8,2)) = N'(O,(8,2)) = \binom{10}{2}$$

and we get the usual Bernoulli formula for O(p).
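The path-counting argument for this single-sample example can be checked directly. The following is a minimal sketch, not the author's own computation; the function names `count_paths` and `oc_single_sample` are hypothetical. It counts lattice paths by the addition rule and sums the three acceptance terms.

```python
from math import comb


def count_paths(n):
    """N(O,(x, y)) for every lattice point with x + y <= n, by the addition rule."""
    paths = {(0, 0): 1}
    for total in range(1, n + 1):
        for y in range(total + 1):
            x = total - y
            # A path arrives either from the left (an effective) or from below (a defective).
            paths[(x, y)] = paths.get((x - 1, y), 0) + paths.get((x, y - 1), 0)
    return paths


def oc_single_sample(p, n=10, max_defectives=2):
    """O(p): sum over acceptance points (n - y, y), y <= max_defectives, of N * q^x * p^y."""
    q = 1.0 - p
    paths = count_paths(n)
    return sum(paths[(n - y, y)] * q ** (n - y) * p ** y
               for y in range(max_defectives + 1))


if __name__ == "__main__":
    print(count_paths(10)[(8, 2)], comb(10, 2))   # both give the number of paths to (8, 2)
    print(round(oc_single_sample(0.1), 4))        # chance of accepting a 10 per cent. defective batch
```

Because no point inside the line x + y = 10 lies on the boundary of this scheme, every path is permissible and the counts agree with the binomial coefficients.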

A third feature of the inspection diagram emerges when we examine the “Single Sample
Scheme” depicted on Fig. 2. Because, from the figure, it is apparent that as soon as P crosses
the line RS we can be sure the batch will be rejected; and as soon as P crosses the line AB we can
be sure it will be accepted. So we can replace the rejection boundary by the line RS, and the
acceptance boundary by the line AB, without altering the O.C. curve. And it pays to do this
because the sample sizes associated with points on RS and AB are smaller than the sample sizes
associated with the points on the original inspection boundary.

Similar considerations will apply to any proposed inspection boundary. It can always be
“contracted,” without affecting the O.C. curve, so that finally the functions A(x), R(x) are each
non-decreasing functions of x, and this procedure will always reduce the mean sample size; unless,
of course, the boundary is already in the “contracted” form. We can therefore enunciate the
principle:

Any economical inspection scheme will have A and R boundaries such that A(x) and R(x)
are non-decreasing functions of x (3)

But the limitation implied by this principle does not enable us to determine an inspection
scheme, when the Risk and Safe Points are given. There will be many “contracted” schemes,
all having the same Risk and Safe Points. And, although in the example just given we could
calculate O(p) by performing the reverse operation to “contraction,” and so apply the Bernoulli
formula, this will not in general be possible.
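The effect of contraction on the single-sample example (10 items, reject on 3 or more defectives) can be illustrated by spreading probability over the inspection diagram. This is a sketch under the large-batch assumption of the text; `run_scheme`, `full_sample` and `contracted` are hypothetical names introduced here for illustration.

```python
def run_scheme(p, decide, max_items):
    """Propagate probability over the inspection diagram.

    decide(x, y) returns "accept", "reject" or None (continue inspecting).
    Returns (probability of acceptance, mean sample size).
    """
    q = 1.0 - p
    live = {(0, 0): 1.0}          # probability of being at each undecided point
    accept_prob = 0.0
    mean_size = 0.0
    for n in range(1, max_items + 1):
        nxt = {}
        for (x, y), mass in live.items():
            # One more item: effective (move right) or defective (move up).
            for point, step_prob in (((x + 1, y), q), ((x, y + 1), p)):
                nxt[point] = nxt.get(point, 0.0) + mass * step_prob
        live = {}
        for point, mass in nxt.items():
            verdict = decide(*point)
            if verdict is None:
                live[point] = mass
            else:
                mean_size += n * mass
                if verdict == "accept":
                    accept_prob += mass
    return accept_prob, mean_size


def full_sample(x, y):
    """Original scheme: inspect all 10 items, then accept if fewer than 3 defectives."""
    if x + y < 10:
        return None
    return "accept" if y <= 2 else "reject"


def contracted(x, y):
    """Contracted scheme: stop as soon as the verdict is certain."""
    if y >= 3:
        return "reject"
    if x >= 8:
        return "accept"
    return None


if __name__ == "__main__":
    for name, rule in (("full", full_sample), ("contracted", contracted)):
        prob, size = run_scheme(0.1, rule, 10)
        print(f"{name:>10}: O(p) = {prob:.4f}, mean sample size = {size:.2f}")
```

Both rules give the same acceptance probability, while the contracted rule stops earlier on average, which is the point of principle (3).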

In general, the calculation of O(p) for a contracted scheme is very difficult. If we neglect the
presence of the R boundary, and if $A_r = (x_r, r)$ is the first point on the A boundary for which
y = r, then the formula*

$$N'(OA_r) = \binom{x_r + r}{r} - \sum_{j=0}^{r-1} \binom{x_r - x_j + r - j}{r - j}\, N'(OA_j) \tag{4}$$

enables us to calculate the acceptance properties of such a scheme. But when $x_r$ and r
are large, the formula becomes unmanageable. Other methods exist, as we shall see later, but
none of them is as simple as Wald's method, to which we now turn.

* Found simultaneously by myself and Mr. James Godwin.


Wald's Binomial Test

With the assumptions stated at the beginning of this section, if Tpr(A,p), Tpr(R,p) denote,
respectively, the total probabilities of acceptance and rejection for the boundaries A and R,

$$\operatorname{Tpr}(A,p) = \sum_{A} \{N'(OP)\, q^{x} p^{y}\} \tag{5}$$

where P = (x,y) runs over the A boundary, and, similarly,

$$\operatorname{Tpr}(R,p) = \sum_{R} \{N'(OP)\, q^{x} p^{y}\} \tag{5'}$$

where P runs over the R boundary.

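Wald's method consists, not in determining Tpr(A,p), Tpr(R,p), for given A and R boundaries,
but rather in determining the A and R boundaries so as to obtain prescribed values for Tpr(A,p)
and Tpr(R,p). In fact, if $p_1$ and $p_2$ are given, and we determine the A boundary so that all
along it

$$\frac{q_1^{x} p_1^{y}}{q_2^{x} p_2^{y}} = \frac{1 - \alpha}{\beta} = \text{const.} \tag{6a}$$

and the R boundary so that all along it

$$\frac{q_1^{x} p_1^{y}}{q_2^{x} p_2^{y}} = \frac{\alpha}{1 - \beta} = \text{const.} \tag{6b}$$

then we shall have

$$\operatorname{Tpr}(A,p_1) = \sum_{A} \Big\{N'(OP)\, q_2^{x} p_2^{y}\, \frac{1 - \alpha}{\beta}\Big\} = \operatorname{Tpr}(A,p_2) \cdot \frac{1 - \alpha}{\beta}$$

and similarly

$$\operatorname{Tpr}(R,p_1) = \operatorname{Tpr}(R,p_2) \cdot \frac{\alpha}{1 - \beta}$$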


Assuming for the moment (we prove it later) that the procedure is bound to terminate sooner
or later either in acceptance or rejection, we have

$$O(p_1) = \operatorname{Tpr}(A,p_1) = 1 - \operatorname{Tpr}(R,p_1)$$

and

$$O(p_2) = \operatorname{Tpr}(A,p_2) = 1 - \operatorname{Tpr}(R,p_2)$$

and then, eliminating the Tpr's from these equations, we get

$$O(p_1) = 1 - \alpha \quad\text{and}\quad O(p_2) = \beta \tag{7}$$

Now, returning to the equations (6), they can be put in the form

$$x - by = H \tag{8a}$$

$$x - by = H' \tag{8b}$$

where

$$b = \log(p_2/p_1) / \log(q_1/q_2) \tag{9a}$$

$$H = \log\{(1 - \alpha)/\beta\} / \log(q_1/q_2) \quad\text{and}\quad H' = \log\{\alpha/(1 - \beta)\} / \log(q_1/q_2) \tag{9b}$$

which shows that the A and R boundaries are parallel straight lines, satisfying the condition (3)
since b is positive.

Here we should note a slight difficulty. The points on our inspection boundary must have
integer co-ordinates; and it will not in general be possible to satisfy (8) with integer values.
Because of this, we have to replace “=” in (6) by “≥” in (6a) and by “≤” in (6b), and, following
through, we find we have to replace “=” by “≥” and α by α/(1 - β) in the first equation of (7),
while in the second equation we replace “=” by “≤” and β by β/(1 - α). When α and β are small
the effect of these modifications is very slight, and we shall henceforth neglect them.

By Wald's method, given the Risk and Safe Points $p_1$ and $p_2$, we can determine b, H and H'
from (9), and so determine an inspection scheme. Conversely, given b, H and H', we can
determine the values of O(p) for pairs of values, $p_1$, $p_2$, by solving (9) for α and β. In both cases
the approximations involved are slight when α and β are reasonably small.*

* Wald says (1) that he overlooked the possibility of working back from (8) to (7) at first. The con- 
verse procedure was found independently by George W. Brown and Milton Friedman in America, and 
in this country by the present author. 
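The step from the Risk and Safe Points to an inspection scheme is a short calculation. The sketch below assumes that (9a) and (9b) read b = log(p2/p1)/log(q1/q2), H = log((1 - α)/β)/log(q1/q2) and H' = log(α/(1 - β))/log(q1/q2); the function name `wald_boundaries` and the example values are hypothetical.

```python
from math import log


def wald_boundaries(p1, p2, alpha, beta):
    """Return (b, H, H_prime) for the boundaries x - b*y = H and x - b*y = H_prime."""
    if not 0.0 < p1 < p2 < 1.0:
        raise ValueError("need 0 < p1 < p2 < 1")
    q1, q2 = 1.0 - p1, 1.0 - p2
    denom = log(q1 / q2)                      # positive, since q1 > q2
    b = log(p2 / p1) / denom                  # penalty, assumed form of (9a)
    H = log((1.0 - alpha) / beta) / denom     # acceptance line, assumed form of (9b)
    H_prime = log(alpha / (1.0 - beta)) / denom   # rejection line (negative)
    return b, H, H_prime


if __name__ == "__main__":
    # Hypothetical 100 to 1 Safe and Risk Points of 1 and 5 per cent. defective.
    b, H, H_prime = wald_boundaries(0.01, 0.05, 1 / 101, 1 / 101)
    print(f"b = {b:.2f}, H = {H:.2f}, H' = {H_prime:.2f}")
```

Any base of logarithm gives the same b, H and H', since only ratios of logarithms appear.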



Simplification of the Wald procedure

The calculation involved in setting up an inspection scheme of Wald's type is not at all difficult,
especially since common logarithms can be used in (9). But the derivation of actual inspection
rules from the lattice diagram may happen to be somewhat tedious, and some simplification seems
desirable.

Looking at equations (8), we see that if we introduce a “score” S, putting

$$S = x - by$$

we can compute S at each stage of the inspection by adding 1 each time an effective item is found,
and subtracting b each time a defective is found. The procedure can then be described by saying
that the score starts at 0, and is kept at each stage. As soon as it rises to H, the procedure stops
and the batch is accepted; while as soon as S falls to H', the procedure stops and the batch
is rejected.

If, in particular, $p_2$ and $p_1$ are the Producer's 100-1 Risk and Safe Points, we have α = β =
1/101, and then we have H = -H'. To avoid negative numbers, we then start the score with a
“handicap” H. We add 1 point for each effective found, and subtract b points for each
defective. The process continues until either S reaches 2H (batch accepted), or S falls to 0
(batch rejected). b is called the “penalty.”
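The handicap and penalty rule is easy to state as a procedure. The following minimal sketch applies it to a recorded sequence of inspection results; the function name `handicap_inspection` and the convention that `True` marks a defective are assumptions made for the illustration.

```python
def handicap_inspection(items, handicap, penalty):
    """Apply the handicap/penalty rule to a sequence of inspection results.

    items: iterable of booleans, True meaning the item was found defective.
    Returns (verdict, number of items inspected); verdict is None if the
    items run out before either limit is reached.
    """
    score = handicap                       # start at H to avoid negative scores
    inspected = 0
    for defective in items:
        inspected += 1
        score += -penalty if defective else 1
        if score >= 2 * handicap:          # score has risen to 2H
            return "accept", inspected
        if score <= 0:                     # score has fallen to 0 (or below)
            return "reject", inspected
    return None, inspected


if __name__ == "__main__":
    # Hypothetical scheme with handicap 5 and penalty 3.
    print(handicap_inspection([False] * 5, 5, 3))            # five effectives in a row
    print(handicap_inspection([True, True], 5, 3))           # two defectives at once
    print(handicap_inspection([True] + [False] * 8, 5, 3))   # one early defective
```

A non-integer penalty works equally well here, which is why the test "score <= 0" is used rather than equality.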

On this basis we can draw up charts from which the values of H and b for given values of
$p_1$ and $p_2$ can be read directly. The charts actually made are based on 100-1 chances. But if,
for example, we required 10-1 chances instead of 100-1, we simply have to take ½H instead of
H for our “handicap.” This follows from the formulae (9), and is the reason why we choose
to give the risks in terms of odds rather than in terms of probabilities. Further, if we wish to
estimate the O.C. curve for given H and b, we can do it by reading the charts in an inverse sense,
as indicated by an example below. Alternatively, of course, for the O.C. curve we can use
logarithmic probability paper, as already indicated. In fact, it is best to find the middle part
of the O.C. curve using probability paper, and to find the ends of the O.C. curve using the
charts.

The “Ruine des Joueurs”

This simplified version of Wald's scheme can be compared to a game between two players
R and A. The “skill” of R is measured by p, and that of A by q; and the stakes are 1 and b
respectively. Each player starts the game with H counters, and the game continues until one or
other is bankrupt. This assumes, of course, that b is an integer; but this is not a serious
restriction.

Now, Christian Huyghens, in 1657, proposed the problem: “A and B each take twelve
counters and play with three dice on this condition, that if eleven is thrown, A gives a counter
to B, and if fourteen is thrown, B gives a counter to A; and he wins the game who first obtains
all the counters. Show that A's chance is to B's as 244,140,625 is to 282,429,536,481.”

To solve this problem approximately, using our charts, we find first the ratio of chances at
each throw to be 27 : 15. The “percentage defective” with which we are concerned is therefore
(100 × 15)/(15 + 27) = 35.7 per cent. The penalty here is 1. From the charts we find, for
100p = 35.7 per cent., and b = 1, that H = 7.5, about. The actual handicap is 12. So the
actual odds are A to 1, where

$$\log A / \log 100 = 12/7.5 = 1.6$$

from which A = 1600, in moderately good agreement with Huyghens' result. Most of the error
will arise through errors of interpolation on the charts.
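The comparison with Huyghens' figures can be made explicit. The sketch below counts the ways of throwing eleven and fourteen with three dice and uses the classical gambler's-ruin result for unit stakes and twelve counters a side, namely that the odds are the twelfth power of the ratio of chances at a single throw; the names `ways`, `exact_odds` and `chart_odds` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def ways(total):
    """Number of ways three dice can show the given total."""
    return sum(1 for dice in product(range(1, 7), repeat=3) if sum(dice) == total)


per_throw = Fraction(ways(11), ways(14))   # ratio of chances at each throw, 27 : 15
exact_odds = per_throw ** 12               # gambler's ruin, 12 counters each, unit stakes
chart_odds = 100 ** (12 / 7.5)             # odds implied by reading H = 7.5 from the charts

if __name__ == "__main__":
    print(exact_odds.numerator, ":", exact_odds.denominator)
    print(f"exact odds about {float(exact_odds):.0f} to 1, chart reading about {chart_odds:.0f} to 1")
```

The exact ratio reproduces the two numbers quoted by Huyghens, so the gap between it and the chart reading is a measure of the interpolation error mentioned in the text.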

Huyghens' answer is, however, exact, which implies he must have had a method for solution.
The first published solution of Huyghens' problem was given by James Bernoulli, in Ars
Conjectandi (1713), and his method directly generalizes to our case:

Let u(x) be the probability that A will win when he has x counters. Then, at the next trial,
A will either win 1 counter from R, with probability q, or he will lose b counters to R, with
probability p. Consequently,

$$u(x) = q\, u(x + 1) + p\, u(x - b) \tag{10}$$

This is a difference equation for u(x), which has to be solved subject to the boundary conditions

$$u(0) = u(-1) = \dots = u(-b + 1) = 0 \tag{10a}$$

and

$$u(2H) = 1 \tag{10b}$$

This can be solved in the usual form

$$u(x) = \sum_i C_i\, t_i^{x}$$

where the $t_i$ are the roots of the auxiliary equation

$$t^{b} = q\, t^{b+1} + p$$


and these roots can easily be found geometrically. But the solution is rather clumsy. The best
solution I have seen so far is due to Mr. Burman, who finds

$$u(x) = F(x)/F(2H) \tag{11}$$

where

$$F(x) = q^{-x}\Big\{1 - \binom{x - b - 1}{1}(pq^{b}) + \binom{x - 2b - 1}{2}(pq^{b})^{2} - \dots\Big\}$$

for x > 0, the series terminating when the upper index falls below the lower, and F(x) = 0 for x ≤ 0.

From (11) we get the exact O.C. curve of a scheme with handicap H and penalty b, by putting
O(p) = u(H).
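Equation (10) with the boundary conditions (10a) and (10b) can also be solved numerically by running the recurrence forward from the lower boundary and then normalising, which gives a ratio of the same form as (11). This is a sketch only, not Mr. Burman's series; it assumes an integer penalty b, and the function name `win_probability` is hypothetical.

```python
from fractions import Fraction


def win_probability(x, p, b, H):
    """u(x) from (10) with boundary conditions (10a) and (10b).

    p: probability of a defective (a Fraction keeps the arithmetic exact);
    b: positive integer penalty; H: handicap, so the upper limit is 2H.
    """
    q = 1 - p
    # F(x) = 0 for x <= 0; F(1) is an arbitrary non-zero starting value.
    F = {k: Fraction(0) for k in range(-b + 1, 1)}
    F[1] = Fraction(1)
    for k in range(1, 2 * H):
        # (10) rearranged: F(k + 1) = (F(k) - p F(k - b)) / q
        F[k + 1] = (F[k] - p * F[k - b]) / q
    return F[x] / F[2 * H]                 # normalise so that u(2H) = 1


if __name__ == "__main__":
    # Huyghens' game: A wins a counter with chance 15/42 and loses one with chance 27/42.
    u = win_probability(12, Fraction(27, 42), 1, 12)
    print(u, "=", float(u))
```

With exact fractions the Huyghens case returns 5^12/(5^12 + 9^12), in agreement with the odds he quoted. For large 2H and floating-point p the forward recurrence may lose accuracy, so the exact arithmetic used here is a deliberate choice.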

Sample size 

If $p_1$ is the Producer's N-1 Safe Point, then the average sample size for p = $p_1$ is, for N
large, approximately $H(N - 1)/\{S(N + 1)\}$, where S is the “mean score,” q - bp. This
approximation is due to Wald.

If we want the exact distribution of sample size, for any value of p, we notice that the general
term in the series (5),

$$N'(OP)\, q^{x} p^{y}$$

is the probability that the batch will be accepted when the sample size is (x + y). There is a
similar term in (5'). So if we regard p and q as independent variables (i.e., if we do not use the
relation p + q = 1), we find the probability generating function for sample size to be

$$G(t) = \operatorname{Tpr}(A, pt, qt) + \operatorname{Tpr}(R, pt, qt)$$

and the mean sample size, for example, is then G'(1).

This result will apply in the case of a large batch, whatever the shape of the inspection boundary. 
Mr. Burman uses it to extend his formula for the O.C. curve to cover the sample size, in the Wald 
case. {See p. 98 of this issue.) 
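The mean sample size of a handicap and penalty scheme can be computed exactly for an integer penalty by following the probability of each undecided score, and then compared with Wald's approximation H(N - 1)/{S(N + 1)}. The sketch below is an illustration only; the function names `sample_size_properties` and `wald_mean_size` are hypothetical, and N is taken as the odds of acceptance at the value of p used.

```python
def sample_size_properties(p, b, H, tol=1e-15):
    """Exact acceptance probability and mean sample size for integer penalty b.

    The score starts at H, gains 1 per effective and loses b per defective,
    and inspection stops at 2H (accept) or at 0 or below (reject).
    """
    q = 1.0 - p
    live = {H: 1.0}            # probability of each undecided score
    accept_prob = 0.0
    mean_size = 0.0
    while sum(live.values()) > tol:
        # Mean sample size = sum over n >= 0 of P(sample size > n).
        mean_size += sum(live.values())
        nxt = {}
        for score, mass in live.items():
            up, down = score + 1, score - b
            if up >= 2 * H:
                accept_prob += mass * q
            else:
                nxt[up] = nxt.get(up, 0.0) + mass * q
            if down > 0:
                nxt[down] = nxt.get(down, 0.0) + mass * p
            # Otherwise the batch is rejected and that mass leaves the live set.
        live = nxt
    return accept_prob, mean_size


def wald_mean_size(p, b, H, accept_prob):
    """H(N - 1) / (S(N + 1)), with N to 1 the odds of acceptance and S = q - b*p."""
    N = accept_prob / (1.0 - accept_prob)
    return H * (N - 1.0) / ((1.0 - p - b * p) * (N + 1.0))


if __name__ == "__main__":
    for b in (1, 3):
        acc, size = sample_size_properties(0.05, b, 6)
        print(f"b = {b}: O(p) = {acc:.4f}, exact mean = {size:.2f}, "
              f"Wald approx = {wald_mean_size(0.05, b, 6, acc):.2f}")
```

When b = 1 the score cannot overshoot either limit and the two figures coincide; for larger penalties the score may fall below 0 at rejection, and the formula is then only approximate.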

Alternatively, we can extend Bernoulli's method (as Laplace did), and let u(x,y) be the
probability that A wins in exactly y more steps, when he has x counters. Then

$$u(x,y) = q\, u(x + 1, y - 1) + p\, u(x - b, y - 1)$$

and the boundary conditions now involve y. An interesting solution of a special case of this
equation was found by the Rev. Leslie Ellis (6).

The fact that the mean sample size for the Wald test, as found by Mr. Burman, is always 
finite, constitutes a proof that the procedure is bound to terminate eventually. More strictly, it 
follows from Markoff's inequality that the probability that the procedure will go on for ever is 
zero. But the probability that the Wald procedure will go on for more than n steps is finite, 
however large n is. And so it sometimes may be desirable to “ close ” a Wald scheme, by putting 
an upper limit on the sample size. The effect of such a “ closure ” is often not serious; it has 
been studied in detail by Miss Stockman and Mr. Armitage, and their results are published else- 
where in the present issue of this Journal. 

II. General Sequential Tests 

Diffusion problems 

The problem of the game we have just dealt with is similar to a linear diffusion problem, 
where a particle starts out from an origin O and, at successive unit intervals of time, jumps either 
to the right a distance 1, or to the left a distance b. The particle is constrained to lie in a line, 
and we have two absorbent boundaries, at distances H and — H from O. 



In diffusion problems we usually consider that the interval between successive jumps is small,
and that the distances, 1 and b, moved at each jump, are both small compared with H. These
assumptions enable the problem to be reduced to a differential equation, instead of a difference
equation.

This leads us to inquire what happens to our equation (10) when we make such limiting
assumptions.

Introducing the difference operator $\Delta$, by

$$\Delta u(x) = u(x+1) - u(x)$$

so that $(1+\Delta)^{b}\,u(x) = u(x+b)$, we can write equation (10) as

$$\{q\Delta + p((1+\Delta)^{-b} - 1)\}\,u(x) = 0$$

Now, if $\Delta u(x)$ and $(pb - q)$ are of the first order of small quantities, and we neglect differences
higher than the second, then, to the second order, the equation is equivalent to

$$\tfrac{1}{2}(pb^{2} + q)\,\Delta^{2}u(x) + (q - bp)\,\Delta u(x) = 0$$

and $\mu_1 = (q - bp)$ is the “mean jump” in the unit interval of time, while $\mu_2 = (q + pb^{2})$ is the
second moment of the jump. Retaining the heuristic character of the argument, we now replace
$\Delta$ by $d/dx$, and so get the differential equation

$$u''(x) + \lambda\,u'(x) = 0 \qquad (13)$$

where $\lambda = 2\mu_1/\mu_2$. This is the “steady state” differential equation for linear diffusion. If we
introduce more general “stopping points,” $K$ and $-L$, instead of $H$ and $-H$, the boundary
conditions become

$$u(K) = 1, \quad u(-L) = 0 \qquad (13a)$$

Solving this equation, we find, for the probability of acceptance

$$u(0) = \frac{1 - e^{-\lambda L}}{1 - e^{-\lambda (K+L)}} = f(\lambda) \qquad (14)$$


We refer the reader who wishes for a rigorous statement of the conditions of validity for this 
approximation to Khintchine (5), where a rigorous proof is given. 
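The diffusion approximation can be checked numerically in the one case where the difference equation has a familiar closed solution. The sketch below is illustrative only: it assumes the approximation takes the form u(0) = (1 - exp(-λL)) / (1 - exp(-λ(K+L))) with λ = 2μ₁/μ₂, and it compares this with the classical gambler's-ruin result, which holds only for b = 1. The function names are hypothetical and do not come from the paper.

```python
import math


def drift_parameter(q, p, b):
    """Return lambda = 2*mu1/mu2 for a jump of +1 (probability q) or -b (probability p)."""
    mu1 = q - b * p          # mean jump per unit of time
    mu2 = q + p * b * b      # second moment of the jump
    return 2.0 * mu1 / mu2


def acceptance_probability(lam, K, L):
    """Diffusion approximation to the chance of reaching +K before -L, starting at 0."""
    if abs(lam) < 1e-12:
        return L / (K + L)   # limiting value as the drift tends to zero
    return (1.0 - math.exp(-lam * L)) / (1.0 - math.exp(-lam * (K + L)))


def exact_unit_step(q, p, K, L):
    """Classical gambler's-ruin answer; valid only when b = 1 and K, L are integers."""
    if q == p:
        return L / (K + L)
    ratio = p / q
    return (1.0 - ratio ** L) / (1.0 - ratio ** (K + L))


lam = drift_parameter(q=0.55, p=0.45, b=1)
approx = acceptance_probability(lam, K=10, L=10)
exact = exact_unit_step(q=0.55, p=0.45, K=10, L=10)
print(f"lambda = {lam:.4f}, approximate = {approx:.4f}, exact = {exact:.4f}")
```

With a small mean jump relative to the stopping distances the two values agree closely, which is the regime the heuristic argument assumes.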

Linear sequential tests

The relationship of the sequential inspection method to the linear diffusion problem is useful
in two ways. First, it suggests an extension to two-, three-, or more-dimensional diffusion
problems; and these problems are seen to be related to those inspection problems where we are
not merely concerned to classify components as “effective” or “defective,” but we require to
classify them separately according to two or more properties, such as length and diameter of rods.
In such cases the plane inspection diagram no longer serves to represent what happens, and we
have to have a three- or more-dimensional diagram, with one axis for each of the component
properties we are inspecting.

Secondly, the diffusion analogy leads us to generalize in another direction — to consider 
sampling from distributions other than the binomial ones we have so far dealt with. 

In fact, suppose we sample from a population for which the likelihood of the observation
$X = x$ is $P(x,\theta)$, where $\theta$ is an unknown parameter in whose value we are interested. Assuming
suitable conditions about $P(x,\theta)$ (such as, for example, that for each value of $x$, $P(x,\theta)$ is a
continuous function of $\theta$, strictly increasing for $\theta < \theta(x)$ and strictly decreasing for $\theta > \theta(x)$), we
can construct a test of the hypothesis $H_0$, which says that $\theta = \theta_0$, as follows: We take as origin
some “base point” $x_0$, such that the first moment $\mu_1(\theta_0)$ of the distribution $P(x,\theta_0)$, about this
origin, is positive, while the second moment about this origin is $\mu_2(\theta_0)$. Referred to this origin,
we take the cumulative sum of observations $\sum x_i$ to be our “score,” $S$. Then, with stopping
points $K$ and $L$, we accept $H_0$ as soon as $S$ reaches $K$, and reject as soon as $S$ reaches $-L$.

Under these conditions, provided that $\mu_1$ and $\mu_2$ are small compared with $K$ and $L$, we can
calculate the significance level $\alpha$ of our test from the formula (14), approximately,

$$\alpha = 1 - f(\lambda_0) \qquad (15)$$

where $\lambda_0 = 2\mu_1(\theta_0)/\mu_2(\theta_0)$

and we can adjust $\lambda_0$, $K$ and $L$ so as to obtain any desired value of $\alpha$.




In this way we test the hypothesis $\theta = \theta_0$ against alternatives $\theta < \theta_0$.

If, for example, we are concerned with normal populations with unit variance, and

$$P(x,\theta) = \frac{1}{\sqrt{2\pi}}\exp\{-\tfrac{1}{2}(x - \theta)^{2}\} \qquad (16)$$

if $\theta_0$ is positive we can take the origin to be $x = 0$, and then

$$\lambda_0 = 2\theta_0/(1 + \theta_0^{2}) \qquad (17)$$

and if we now take $K = (1/\lambda_0)\log_e a$, and $L = (1/\lambda_0)\log_e b$, we find

$$\alpha = (1 - a)/(1 - ab)$$
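The arithmetic of the normal example can be traced in a few lines. This is a sketch under the reading that λ₀ = 2θ₀/(1 + θ₀²), K = log(a)/λ₀, L = log(b)/λ₀ and α = (1 - a)/(1 - ab); the numerical values of θ₀, a and b are arbitrary illustrations, and the function names are hypothetical.

```python
import math


def normal_lambda(theta0):
    """lambda_0 = 2*theta0 / (1 + theta0**2): unit-variance normal, origin at x = 0."""
    return 2.0 * theta0 / (1.0 + theta0 ** 2)


def stopping_points(theta0, a, b):
    """Stopping distances K = log(a)/lambda_0 and L = log(b)/lambda_0."""
    lam0 = normal_lambda(theta0)
    return math.log(a) / lam0, math.log(b) / lam0


def significance_level(a, b):
    """alpha = (1 - a) / (1 - a*b): chance of reaching -L first when H0 is true."""
    return (1.0 - a) / (1.0 - a * b)


theta0 = 0.5                      # illustrative value only
K, L = stopping_points(theta0, a=20.0, b=20.0)
alpha = significance_level(20.0, 20.0)

# Cross-check: alpha should equal one minus the diffusion acceptance probability.
lam0 = normal_lambda(theta0)
accept = (1.0 - math.exp(-lam0 * L)) / (1.0 - math.exp(-lam0 * (K + L)))
print(f"K = {K:.3f}, L = {L:.3f}, alpha = {alpha:.4f}, 1 - accept = {1.0 - accept:.4f}")
```

Choosing a and b directly, rather than K and L, is what makes the expression for α free of λ₀.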

Linear sequential tests of this kind can be made to cover a wide variety of statistical problems.
For example, we could, if necessary or convenient, test the variance of some (not necessarily
normal) population, basing our test on, say, the range $w$ of samples of three; $w$ then replaces
$x$ above, and $\sigma$ replaces $\theta$.

A linear sequential test (L.S. test) is, in general, a test which is based on a score consisting of
the cumulative sum of the observations. Such tests will, for reasonably large samples, always be
governed by the differential equation (13), and we always have three parameters at our disposal
(the “base point,” and the two stopping points) for fixing the significance level $\alpha$.

Wald's probability ratio test

The line of development we have pursued so far is distinct from Wald's. He begins by considering
a situation in which we wish to test two hypotheses, $H_0$ and $H_1$, against each other. The
$H_j$ ($j = 0, 1$) specify the likelihood function $P_j(x)$. If the $i$th observation is $X = x_i$, we put

$$z_i = \log\{P_0(x_i)/P_1(x_i)\}$$

the logarithm of the likelihood ratio, and $C_n$ is then the cumulative sum of the $z_i$ up to the $n$th
stage. The sequential probability ratio test (S.P.R. test) is then defined by the inequalities

$$(H_1), \quad \log B < C_n < \log A, \quad (H_0) \qquad (19)$$

which mean that $H_0$ is accepted as soon as $C_n$ reaches $\log A$, while $H_1$ is accepted as soon as
$C_n$ reaches $\log B$.

By an argument of which we have already given an outline on page 6, Wald then shows that
the probability of accepting $H_1$ when $H_0$ is true is $\alpha$, and the probability of accepting $H_0$ when
$H_1$ is true is $\beta$, approximately, provided that

$$A = (1 - \beta)/\alpha \quad \text{and} \quad B = \beta/(1 - \alpha)$$
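The mechanics of such a test are easy to sketch. The code below is a minimal illustration, not the paper's procedure: it adopts Wald's customary convention, in which the score accumulates log(P₁/P₀), H₁ is accepted at log((1 - β)/α) and H₀ at log(β/(1 - α)). That sign convention, and all the function names, are assumptions of this sketch and may differ from the convention used in the text.

```python
import math


def sprt(observations, log_ratio, alpha, beta):
    """Run a probability ratio test; log_ratio(x) must return log(P1(x)/P0(x))."""
    upper = math.log((1.0 - beta) / alpha)   # reaching this accepts H1
    lower = math.log(beta / (1.0 - alpha))   # reaching this accepts H0
    total = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        total += log_ratio(x)
        if total >= upper:
            return "accept H1", n
        if total <= lower:
            return "accept H0", n
    return "continue", n                     # data exhausted before a decision


def normal_log_ratio(theta0):
    """log(P1/P0) for unit-variance normals with means -theta0 (H1) and +theta0 (H0)."""
    return lambda x: -2.0 * theta0 * x


decision, steps = sprt([1.0] * 5, normal_log_ratio(0.5), alpha=0.05, beta=0.05)
print(decision, steps)
```

Because the increment is linear in x for this pair of normal hypotheses, the score is simply a multiple of the running sum of the observations.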

If we apply this to the case of two normal populations, each with unit variance, and with
$H_0$ saying that the mean is $+\theta_0$, while $H_1$ says that the mean is $-\theta_0$, we find

$$\log_e\{P_0(x_i)/P_1(x_i)\} = 2\theta_0\,x_i$$

so that in this case the P.R.S. test is an L.S. test. It is in fact the same test as that indicated
in the previous section, provided we put, approximately, for $\theta_0$ small,

$$\lambda_0 = 2\theta_0/(1 + \theta_0^{2}) \approx 2\theta_0$$

and we require of our L.S. test that

$$f(-\lambda_0) = \beta \qquad (20)$$

which is the condition that our L.S. test, besides giving a probability $\alpha$ of rejecting $H_0$ when $H_0$
is true, should also give a probability $\beta$ of rejecting $H_1$ when $H_1$ is true, assuming we accept $H_1$
when we reject $H_0$, and vice versa.

Thus with the P.R.S. test, as compared with the L.S. test, we use up two degrees of freedom,
in fixing both $\alpha$ and $\beta$. But with the general L.S. test we have three degrees of freedom available:
the choice of the two stopping points $K$ and $L$, and also the choice of the “base point,” or origin.
$\alpha$ and $\beta$ account for the two stopping points, but what, with the P.R.S. test, has happened to our
third degree of freedom, the choice of the base point?




Optimum property of the P.R.S. test

On comparing the L.S. test with the P.R.S. test for normal populations of unit variance, we
see that, in the case of the L.S. test we fixed our “base point” at $x = 0$ for no better reason than
mere convenience. On the other hand, with the P.R.S. test of $H_0$ against $H_1$, where $H_0$ gave
$\theta = \theta_0$ while $H_1$ gave $\theta = -\theta_0$, we were compelled by the P.R.S. procedure to take our base
point at $x = 0$, midway between the two means, where the probability densities on the two
hypotheses were equal.

If, with the L.S. test, we took our base point somewhere other than the origin, the means
referred to this new origin would be $t_0$ and $t_1$ (the latter reckoned as a distance below the
origin), and they would necessarily satisfy

$$t_0 + t_1 = 2\theta_0 \qquad (21)$$

and if we now put

$$\lambda_0 = 2t_0/(1 + t_0^{2}); \quad \lambda_1 = 2t_1/(1 + t_1^{2}) \qquad (22)$$

then our conditions for $\alpha$ and $\beta$ become

$$f(\lambda_0) = 1 - \alpha, \quad f(-\lambda_1) = \beta \qquad (23)$$

The extra condition imposed by the P.R.S. procedure then becomes

$$t_0 = t_1 \qquad (24)$$

Now the mean sample size, if $H_0$ is true, is approximately

$$S_0 = K/t_0$$

while the mean sample size if $H_1$ is true is approximately

$$S_1 = L/t_1$$

and if we now take $\theta_0$, $\alpha$ and $\beta$ to be fixed, and $K$, $L$, $t_0$, $t_1$, $\lambda_0$ and $\lambda_1$ to vary subject to (21),
(22), and (23), we find, using undetermined multipliers, that the conditions that $S_0$ should be
minimized and that $S_1$ should be minimized give us the further equation (24).

Thus the P.R.S. test in this case is the “best” of the one-parameter family of L.S. tests
satisfying the conditions about $\alpha$ and $\beta$, in the sense that, for the P.R.S. test, $S_0$ and $S_1$ are both
minimized.

In fact, Wald (1) has shown, by a rigorous argument, that no sequential test, linear or not,
which satisfied the conditions about $\alpha$ and $\beta$ can make $S_0$ or $S_1$ appreciably smaller than the
P.R.S. test makes them.

Condition for the P.R.S. test to be an L.S. test

This optimum property of the P.R.S. test leads us to inquire, when will a P.R.S. test be an
L.S. test? What restriction on the form of the one-parameter family of likelihood functions,
$P(x,\theta)$, is implied?

Since the P.R.S. test is based on the cumulative sum of $z_i$, we must have

$$z_i = \log\{P(x_i,\theta_0)/P(x_i,\theta_1)\} = x_i\,\phi(\theta_0,\theta_1) + \psi(\theta_0,\theta_1)$$

and if this is to hold for all $x_i$ and for all values of $\theta_0$, $\theta_1$, we can fix $\theta_1$, put $\theta_0 = \theta$, and
introduce a transformed parameter $\eta = \phi(\theta,\theta_1)$, and then putting $P(x,\theta_1) = g(x)$, we see that
$P(x,\theta)$ must be expressible in the form

$$P(x,\theta) = g(x)\,\exp(x\eta)\,h(\eta).$$

Putting $h(\eta) = 1 - p$ and $\eta = \log_e(p/q)$ gives us the binomial case, while $g(x) = \exp(-\tfrac{1}{2}x^{2})$ gives us the
normal case. And in general, it appears that, for the P.R.S. test to be an L.S. test, for every
pair of distributions in the family, the family must be a family of binomial, normal, or “generalized
Type III” distributions, taking $\eta$ as the parameter instead of $\theta$. In particular, it is evident that
the P.R.S. test for the variance of a normal population of known mean will be an L.S. test.
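As a worked check of the binomial case, the single-observation likelihood can be written out explicitly. The coding x = 1 for a defective and x = 0 for an effective is an assumption of this sketch, as is the identification g(x) = 1.

```latex
\begin{aligned}
P(x,p) &= p^{x} q^{1-x}
        = (1-p)\,\exp\!\Bigl(x \log_e \tfrac{p}{q}\Bigr),
        \qquad x \in \{0,1\},\quad q = 1-p, \\
\eta &= \log_e \tfrac{p}{q}, \qquad
h(\eta) = 1-p = \frac{1}{1+e^{\eta}}, \qquad g(x) = 1, \\
z_i &= \log_e \frac{P(x_i,p_0)}{P(x_i,p_1)}
     = x_i\,(\eta_0-\eta_1) + \log_e \frac{h(\eta_0)}{h(\eta_1)} .
\end{aligned}
```

The increment $z_i$ is linear in $x_i$, so the cumulative sum of the $z_i$ differs from the cumulative sum of the observations only by a change of scale and of base point.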

Thus, in these cases, the P.R.S. test is an L.S. test, and the optimum property of the P.R.S.
test carries over to the L.S. test, in the sense that, if we work out an L.S. test for a hypothesis
$H_0$, using significance level $\alpha$, then there will exist an $H_1$, and a $\beta$, such that this L.S. test is nearly
optimum for $H_0$, $H_1$, $\alpha$ and $\beta$.

If the P.R.S. test is not a linear test, we shall have to balance the possible gain in economy
with the P.R.S. test, against the possible gain in simplicity of operation with the L.S. test. The
situation here is reminiscent of that concerning the range and standard deviation in normal 
samples, where the greater simplicity of the range sometimes outweighs the loss in statistical 
efficiency. 

Effect of deviations from normality 

The argument given above for the differential equation (13), and the more conclusive argument 
of Khintchine (5), show that the effect of departures from assumed normality in the case of L.S.
tests is not essentially different from their effect with classical tests. In both cases, provided the 
samples are reasonably large, the assumption of normality is not really serious. 

The P.R.S. test does present a new aspect in a way, however, because in making up this test 
there is no “ sampling distribution problem ” to be solved, once and are specified. And 
in industrial work we often know our distributions to be non-normal, and yet we use the normal 
tests, simply because we cannot calculate the true sampling distributions on any other hypotheses. 
Yet with P.R.S. tests, this “ reason of convenience ” for the assumption of normality ceases 
to be valid. 

Another “reason of convenience” takes its place, however. Because the argument above
shows that, unless we assume the distribution to be normal, or at least of a generalised Type 
III kind, our P.R.S. test will not be a linear test. So that, in spite of the greater simplicity of 
setting up the P.R.S. test, it is still advisable to assume normality, if possible, for simplicity of 
operation. 

In this connection another point raises itself. With the P.R.S. test, the zero point for $z$ is
always the point where the probability densities, $P_0(x)$, $P_1(x)$, are equal. This point coincides
with the point midway between the two means if the distributions are normal, and have the same
variance, but it will not do so in general. Suppose, then, that we have a pair of slightly skew
distributions to compare, and we determine the “stopping points” $K$ and $L$ by the formulae (23).
Where should we take our base point?

The differential equation (13) suggests we should again take the base point midway between
the two means, not where $z = 0$. And Miss Rigg has verified this suggestion, in certain cases,
by direct calculation of the probabilities involved.

Other types of sequential test 

In the situations so far considered in detail, we have always been concerned with a single 
unknown parameter. This category does not, of course, exhaust the situations to which 
sequential methods can be applied. 

Wald has considered the general problems of what he calls “composite hypotheses,” and the
reader is referred to his paper for these considerations. In particular, Wald has developed a
sequential analogue of the t-test, to apply to the case of the mean of a normal population of
unknown variance. The calculations involved in the operation of this test, however, are by no
means as simple as those with L.S. tests, and we are led to ask whether an L.S. test can be
devised for the same situation.

In fact, if we are concerned only with one-sided alternatives (that is, if we wish to test
whether the mean is zero or not, when we know it will be positive, if not zero), then, instead of
testing the mean we can test the median. And this can be done simply by observing the signs of
successive observations. On the null hypothesis, the probability that each sign will be “+”
should be ½, while if the null hypothesis is not true, the probability of a “+” will be more than ½.
In this way, we can reduce the problem to the simple binomial case. And the surprising thing is, 
that some sampling experiments by Mr. Armitage indicate that, in using an extension of this 
binomial test instead of Wald’s two-sided test, we lose very little in efficiency. Naturally, in 
extreme cases, Wald’s test will be more efficient, but these will not often occur. 

If we wish to test against two-sided alternatives, we can either run two tests in parallel, one
against positive alternatives and one against negative alternatives, or we can take observations
in pairs. In the latter case, the null hypothesis asserts that the probability of (+,+) or (−,−)
should be the same as the probability of (+,−) or (−,+), so that we can base a test on the
products of consecutive signs.
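The reduction to signs, and to products of signs for the two-sided case, can be sketched as follows. Dropping observations that are exactly zero and pairing the signs off in non-overlapping pairs are assumptions of this illustration; the text does not say how ties or pairing should be handled, and the function names are hypothetical.

```python
def signs(observations):
    """Reduce each observation to +1 or -1; exact zeros are dropped (an assumption)."""
    return [1 if x > 0 else -1 for x in observations if x != 0]


def pair_sign_products(observations):
    """Products of signs taken in non-overlapping pairs, for the two-sided test.

    A product of +1 means the pair agreed, (+,+) or (-,-); -1 means it disagreed.
    A trailing unpaired observation is ignored.
    """
    s = signs(observations)
    return [s[i] * s[i + 1] for i in range(0, len(s) - 1, 2)]


data = [0.3, -1.2, 2.0, 0.0, -0.1, 0.7, 0.4]
print(signs(data))               # input to the one-sided binomial test
print(pair_sign_products(data))  # input to the two-sided binomial test
```

Either list can then be fed to an ordinary sequential binomial scheme with null probability one half.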

As tests of the median, rather than the mean (if the two are different), these binomial tests
have the doubtful advantage of being entirely independent of the form of the distributions involved.
And all of them, except the last one, tend to be better than the classical t-test, in practical cases.

Comparative trials and double dichotomies 

Suppose we have two events, $E_1$ and $E_2$. Under certain specific circumstances, $E_1$ has a
probability $p_1$ of happening; and under certain (possibly different) circumstances, $E_2$ has a
probability $p_2$ of happening. We may wish to compare $p_1$ and $p_2$, to test, for example, whether
$p_1 = p_2$, or whether $p_1 = kp_2$. A trial of $E_1$ and $E_2$, designed to test such a hypothesis, is what
I call a 2 × 2 comparative trial, or 2 × 2 trial.

For example, $E_1$ may represent getting a defective component by one process of production,
while $E_2$ represents getting a defective component by another process of production. The components
need not necessarily be the same. Then we may wish to compare the efficiency of one
process with that of the other. 

In other circumstances, we may have a set of components each of which may or may not have
two properties, A and B. For example, a rod-shaped component may be correct for length (A)
or incorrect for length (not-A); and it may be at the same time correct for diameter (B), or
incorrect for diameter (not-B). In such a case, we may wish to see whether there is association
between A and B. An examination of a sample of such objects, to detect such association, if it
exists, is what I call a double dichotomy.

Provided that some, but not all, the objects we consider have the property A, we can let $p_1$
be the probability that an object has B, when we know it has A, and $p_2$ be the probability that an
object has B, when we know it lacks A. Then in this case, our test of association will be a test
of whether or not $p_1 = p_2$, and in such a case there may be no need to distinguish between 2 × 2
trials and double dichotomies.

Wald has proposed a sequential test, primarily for double dichotomies, but also usable with
2 × 2 trials, which assumes one of the attributes, A say, to be controllable, so that we can determine
that a given object will have A, or will not have A, at will. (For example, with our rod-shaped
components, we might imagine them all to be already sorted into two boxes, according to length,
A and not-A; and we can then choose which box we take from.) To test the hypothesis $p_1 = p_2$,
we make the observations in pairs, each pair containing one A and one not-A (allocated at random,
by tossing a coin if necessary). Then each pair (A, not-A) may give one of four results:

(i) (B, B); (ii) (not-B, not-B); (iii) (B, not-B); (iv) (not-B, B).

Results of the first two kinds are neglected. But (iii) tends to indicate $p_1 > p_2$, while (iv) tends to
indicate the opposite. And if the null hypothesis is true, results (iii) and (iv) should occur equally
often in the long run. So, neglecting results (i) and (ii), we can reduce the test whether $p_1 = p_2$
to a simple binomial test whether or not the probability of (say) (iii), among all non-neglected
results, is ½.
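The reduction of paired results to a binomial problem might be sketched as below. Each pair is represented as two booleans, "the A item has B" and "the not-A item has B"; this representation and the function names are hypothetical, chosen only for the illustration.

```python
def discordant_counts(pairs):
    """Count results (iii) and (iv); results (i) and (ii) are neglected.

    Each pair is (a_item_has_b, not_a_item_has_b).
    """
    n_iii = sum(1 for with_a, without_a in pairs if with_a and not without_a)
    n_iv = sum(1 for with_a, without_a in pairs if not with_a and without_a)
    return n_iii, n_iv


def probability_of_iii(p1, p2):
    """Chance that a non-neglected pair is of kind (iii), given p1 and p2."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return p1 * q2 / (p1 * q2 + q1 * p2)


pairs = [(True, True), (True, False), (False, False), (False, True), (True, False)]
n_iii, n_iv = discordant_counts(pairs)
print(n_iii, n_iv, probability_of_iii(0.3, 0.3))
```

When p1 equals p2 the second function returns one half, which is the null value the binomial test is set to detect departures from.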

This test can be generalized in several different directions. First, we can test whether
$(p_1/q_1)/(p_2/q_2) = r/s$ by testing the binomial probability $r/(r + s)$ instead of ½ among non-neglected
trials. Alternatively, we can use triplets, say (A, not-A, not-A), instead of pairs; such a procedure
is useful for testing $p_1 = 2p_2$ when both $p_1$ and $p_2$ are known to be small. And finally, we may
notice that although the results (i) and (ii) may be irrelevant in the double dichotomy case, they
are not irrelevant in the 2 × 2 trial; and so we may try to construct a test which takes results of
this kind into account.

To do this, imagine a game between two players A and B, in which a third person takes the
part of a “banker.” The banker initially credits both A and B with $H$ points. Supposing that
both $p_1$ and $p_2$ are small compared with ½, we construct the rules of the game as follows:

We take observations in pairs, as before. Then

if the result is (i), the banker adds 1 point to A's and B's credits;
if the result is (ii), the banker subtracts 1 point from A's and B's credits;
if the result is (iii), then B gives $a$ points to A;
if the result is (iv), then A gives $a$ points to B.

The game continues until either A's credit or B's credit, or both credits, are exhausted. In the
first case, we conclude $p_1 < p_2$; in the second, $p_1 > p_2$; while in the third case we may either draw
no conclusion, or conclude that, to sufficient accuracy, $p_1 = p_2$.

If $u(x,y)$ is the probability that A will win when his credit is $x$ and B's credit is $y$, we get
the difference equation:

$$u(x,y) = p_1p_2\,u(x+1,y+1) + q_1q_2\,u(x-1,y-1) + p_1q_2\,u(x+a,y-a) + q_1p_2\,u(x-a,y+a) \qquad (25)$$

with the boundary conditions

$$u(0,y) = 0 \text{ for all } y, \text{ and } u(x,0) = 1 \text{ for all } x > 0 \qquad (25a)$$

It turns out that the equation (25) is of “elliptic type,” while the boundary conditions (25a) are
of “hyperbolic type,” and the solution is not unique, although our game cannot go on for ever.
To overcome this difficulty, we put an upper limit $U$ on the total credit the banker gives at any
one time, which adds the condition

$$u(x, U - x) = 0 \text{ for } 0 < x < U \qquad (25b)$$

Although the solution is now unique, the equation (25) involves differences of too high an
order to be readily soluble. To reduce this difficulty, we modify the procedure slightly. We
reduce “a” to 1, and, instead of making the banker add or subtract 1 point for (i) or (ii), we
make him “threaten” to do so. Specifically, when the result is (i) or (ii), the banker takes a disc
at random from a bag containing $(a - 1)$ counters marked “0,” and 1 counter marked “1.” He
then adds or subtracts the number on the disc from A's and B's credits. This leads to the second
order equation:

$$\{1 - (p_1p_2 + q_1q_2)(a-1)/a\}\,u(x,y) = (p_1p_2/a)\,u(x+1,y+1) + (q_1q_2/a)\,u(x-1,y-1) + p_1q_2\,u(x+1,y-1) + q_1p_2\,u(x-1,y+1)$$


When $p_1$ and $p_2$ are small, a given difference in frequency between results (iii) and results (iv)
is more “significant” than when $p_1$ and $p_2$ are in the neighbourhood of ½. The effect of the
reductions in credit is to allow for this. Of course, $a$ should be large compared with 1, otherwise
we may make a decision on too little evidence. When $p_1$ and $p_2$ are about ½, the game behaves
very much like Wald's test. But an examination of particular cases by Mr. Owen and Miss
Allinson has shown that, for $p_1$ and $p_2$ small, this procedure may be superior to Wald's.
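The modified game is simple to play out on a recorded sequence of pair results. The sketch below is one reading of the rules as described: results (iii) and (iv) transfer one point between the players, and results (i) and (ii) change both credits by one point only when the single disc marked "1" is drawn from a bag of a discs. The string labels, the function name and the use of a seeded random generator are assumptions of this illustration.

```python
import random


def play_modified_game(results, H, a, rng=None):
    """Play the banker game on a sequence of results labelled "i", "ii", "iii", "iv".

    Returns (conclusion, number_of_pairs_used).
    """
    rng = rng or random.Random(0)            # seeded so the sketch is reproducible
    credit_a = credit_b = H
    step = 0
    for step, result in enumerate(results, start=1):
        if result in ("i", "ii"):
            if rng.randrange(a) == 0:        # the one disc marked "1", chance 1/a
                change = 1 if result == "i" else -1
                credit_a += change
                credit_b += change
        elif result == "iii":                # B gives one point to A
            credit_a += 1
            credit_b -= 1
        elif result == "iv":                 # A gives one point to B
            credit_a -= 1
            credit_b += 1
        else:
            raise ValueError(f"unknown result label: {result!r}")
        a_out, b_out = credit_a <= 0, credit_b <= 0
        if a_out and b_out:
            return "no decision", step
        if a_out:
            return "p1 < p2", step
        if b_out:
            return "p1 > p2", step
    return "continue", step


print(play_modified_game(["iii", "i", "iii", "iv", "iii", "iii", "iii"], H=3, a=5))
```

A run consisting only of results (iii) exhausts B's credit after exactly H pairs, which is the quickest possible decision in favour of p1 > p2.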

I have not been able to find a solution of the general equation in closed form, but Mr. Allen, 
of Imperial College, has very kindly undertaken to tackle it by relaxation methods as soon as a 
computing machine becomes available. 

Yet another kind of sequential test for 2 × 2 trials was the first sequential test that occurred
to me. It is specially useful in situations where the probabilities $p_1$, $p_2$ are small, and where we
have some idea beforehand about the true value of the ratio $(p_1/p_2)$. For example, in development
work on an instrument we may isolate at least two causes of failure. A proposed modification
may be hoped to remove one of these causes, while it leaves the other unaffected. From past
experience we can often say that if the modification is successful, our failures will be reduced by
a certain percentage; and we may then try to determine whether our hopes are justified by an
appeal to experiment.

The procedure then consists in arranging to try the modified instrument and the unmodified one,
under similar conditions, and to continue to try each until a predetermined number $r_1$ of failures
of the modified instrument, and a predetermined number $r_2$ of failures of the unmodified instrument,
are observed. Then if $p_1$ refers to the modified instrument, and $p_2$ to the unmodified one,
and $n_1$, $n_2$ are the number of observations made in the two cases, the probability of the pair
$(n_1, n_2)$ is

$$\frac{(n_1-1)!}{(r_1-1)!\,(n_1-r_1)!}\,p_1^{r_1}q_1^{\,n_1-r_1} \cdot \frac{(n_2-1)!}{(r_2-1)!\,(n_2-r_2)!}\,p_2^{r_2}q_2^{\,n_2-r_2}$$

while if $p_1 = p_2 = p$, the probability of getting some pair of results having the same total
$n_1 + n_2 = N$ is

$$\frac{(N-1)!}{(r_1+r_2-1)!\,(N-(r_1+r_2))!}\,p^{\,r_1+r_2}q^{\,N-(r_1+r_2)}$$

and so the relative probability, on the null hypothesis, of getting the pair $(n_1, n_2)$ out of all results
with the same total is

$$\frac{(r_1+r_2-1)!\,(N-(r_1+r_2))!}{(N-1)!} \cdot \frac{(n_1-1)!\,(n_2-1)!}{(r_1-1)!\,(r_2-1)!\,(n_1-r_1)!\,(n_2-r_2)!}$$

which is independent of $p$. Hence, if we sum over all more extreme pairs having the same total $N$
and the same $r_1$ and $r_2$, we shall get the significance level of our result.

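For example, with $r_1 = 1$, $r_2 = 2$, if we find $n_1 = 10$, $n_2 = 4$, we have the significance level

$$\frac{2!\,11!}{13!\,0!\,1!}\left\{\frac{9!\,3!}{9!\,2!} + \frac{10!\,2!}{10!\,1!} + \frac{11!\,1!}{11!\,0!}\right\} = \frac{6}{78} = \frac{1}{13}$$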
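The exact significance level can be computed with integer arithmetic. The sketch below assumes that the relative probability of a pair (n1, n2) with fixed total N is C(n1-1, r1-1) C(n2-1, r2-1) / C(N-1, r1+r2-1), which is the ratio of the negative-binomial coefficients involved, and that "more extreme" means a larger n1 (more trials of the modified instrument for the same number of failures). Both the function names and that reading of "more extreme" are assumptions of the illustration.

```python
from fractions import Fraction
from math import comb


def relative_probability(n1, n2, r1, r2):
    """Null probability of (n1, n2) among all pairs with the same total n1 + n2."""
    total = n1 + n2
    return Fraction(comb(n1 - 1, r1 - 1) * comb(n2 - 1, r2 - 1),
                    comb(total - 1, r1 + r2 - 1))


def exact_significance(n1, n2, r1, r2):
    """Sum the relative probabilities of the observed pair and all with larger n1."""
    total = n1 + n2
    level = Fraction(0)
    # n2 cannot fall below r2, so n1 runs from its observed value up to total - r2.
    for m1 in range(n1, total - r2 + 1):
        level += relative_probability(m1, total - m1, r1, r2)
    return level


level = exact_significance(n1=10, n2=4, r1=1, r2=2)
print(level, float(level))
```

Using `Fraction` keeps the result exact, which makes it easy to compare with a hand calculation in factorials.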

If $p_1$ and $p_2$ are small, $n_1$ and $n_2$ will be large, and an approximation becomes desirable.
Noticing that if $X$ is the number of trials required for 1 failure,

$$\Pr\{X > x\} = \exp(-k_1 x) = 1 - \int_0^{k_1 x} e^{-t}\,dt,$$

where $k_1 = -\log(1 - p_1)$, we see that $2k_1X$ is distributed as $\chi^2$ on two degrees of freedom, and
the approximation is improved by taking $Y = X - \tfrac{1}{2}$.

Now $n_1 - \tfrac{1}{2}r_1$ is the sum of $r_1$ variables, independently distributed like $Y$, and so $2k_1(n_1 - \tfrac{1}{2}r_1)$
is distributed approximately as $\chi^2$ on $2r_1$ degrees of freedom. Hence

$$R = \frac{(n_1 - \tfrac{1}{2}r_1)\,k_1/r_1}{(n_2 - \tfrac{1}{2}r_2)\,k_2/r_2}$$

is distributed as $F$ on $(2r_1, 2r_2)$ degrees of freedom. On the null hypothesis the $k$'s cancel, and so,
for $n_1$, $n_2$ large we get the rule, to calculate

$$R = \frac{(n_1 - \tfrac{1}{2}r_1)/r_1}{(n_2 - \tfrac{1}{2}r_2)/r_2}$$

and to enter Fisher's $F$ tables with this value of $R$, for $(2r_1, 2r_2)$ degrees of freedom.
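The quantity entered in the F tables can be computed as follows. This sketch assumes the statistic is R = ((n1 - r1/2)/r1) / ((n2 - r2/2)/r2), referred to (2 r1, 2 r2) degrees of freedom, which is one reading of the rule above; the function name is hypothetical and the tail probability itself is left to tables.

```python
def f_statistic(n1, n2, r1, r2):
    """Return (R, (df1, df2)) for the large-sample F approximation."""
    numerator = (n1 - 0.5 * r1) / r1      # corrected mean trials per failure, modified
    denominator = (n2 - 0.5 * r2) / r2    # corrected mean trials per failure, unmodified
    return numerator / denominator, (2 * r1, 2 * r2)


R, degrees_of_freedom = f_statistic(n1=10, n2=4, r1=1, r2=2)
print(f"R = {R:.3f} on {degrees_of_freedom} degrees of freedom")
```

With counts as small as these the approximation is rough, so the exact sum over pairs is the safer guide; the F form is intended for large n1 and n2.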

The test is a valid one, whatever the values of $r_1$ and $r_2$. But it is best to fix $r_1$ and $r_2$
(a) according to how much experimentation we are prepared to do, on the average, and (b) so as
to maximise the chance of getting a significant result when $p_1$ and $p_2$ have the values we hope
they are going to have.

The mean values of $n_1$ and $n_2$ are

$$\bar{n}_1 = r_1/p_1 \quad \text{and} \quad \bar{n}_2 = r_2/p_2$$

so that, guessing $p_1$ and $p_2$ we can guess how much experimentation we are likely to do, on the
average. If, for example, we expect 2 per cent. failures and 5 per cent. failures, respectively,
and we take $(r_1, r_2)$ to be (1, 2), then we should expect about 50 trials of the modified design, and
40 of the unmodified design.

It sometimes happens that the cost of trying the modified design and the cost of trying the 
unmodified design are not equal. Then, if v represents the ratio of these costs, we shall want to 
minimize 

$$v\bar{n}_1 + \bar{n}_2 = v(r_1/p_1) + (r_2/p_2)$$

which, given a guess at $p_1$ and $p_2$, gives us one relationship between $r_1$ and $r_2$.
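The planning arithmetic is short enough to sketch directly. The code assumes mean numbers of trials r1/p1 and r2/p2 and a total cost of v(r1/p1) + (r2/p2) in units of one trial of the unmodified design; the function names and the cost ratio used in the example are illustrative assumptions.

```python
def expected_trials(r, p):
    """Mean number of trials needed to observe r failures at failure rate p."""
    return r / p


def expected_cost(r1, r2, p1, p2, v):
    """Mean cost, where v is the cost of a modified trial relative to an unmodified one."""
    return v * expected_trials(r1, p1) + expected_trials(r2, p2)


# The example in the text: 2 per cent. and 5 per cent. failures, (r1, r2) = (1, 2).
n1_bar = expected_trials(1, 0.02)
n2_bar = expected_trials(2, 0.05)
print(n1_bar, n2_bar, expected_cost(1, 2, 0.02, 0.05, v=2.0))
```

Tabulating the cost over a small grid of (r1, r2) is one way to carry out the graphical choice by hand.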

The aim (b) gives us another relationship between $r_1$ and $r_2$. When $p_1$ and $p_2$ are small, $k_1$
and $k_2$ are respectively nearly equal to $p_1$ and $p_2$. So that if $(p_1/p_2) = \theta$, then $R/\theta$ is distributed
as $F$. Hence, guessing $\theta$, we can guess the distribution of $R$, and so guess the likely chance of
obtaining a significant result. If we now maximize this chance, we have another condition on
$r_1$, $r_2$, which serves, together with the first condition, to determine $r_1$ and $r_2$ uniquely.

The actual estimation of the best values of $r_1$ and $r_2$ is most easily done graphically. A full
explanation would be lengthy; I hope the method is sufficiently indicated above.

It should be noted that if we guess badly at the values of $p_1$ and $p_2$, all that happens is that
we let ourselves in for more experimentation than we need have done; the validity of our results
is unimpaired.

By entering the $F$ tables with $R/\theta$ instead of $R$, of course, we can test whether $p_1 = \theta p_2$, instead
of $p_1 = p_2$, if we so wish. Another generalization is to the case of a 2 × n trial, where, provided
$p_1, p_2, \ldots, p_n$ are all small, we can use Bartlett's test.



We may notice, as a curiosity, that the true probability function of $R$ is zero at all irrational
points, and non-zero at all positive rational points. Further, the value of this function at the
point $(r/s)$ depends essentially on the position of $(r/s)$ in the Farey series.

III. General Inspection Problems

Qualitative batch inspection 

We now return to the problem we began with — that of determining a suitable inspection 
scheme for batches of components, which we have already seen as the problem of determining a 
desirable acceptance and rejection boundary on the inspection diagram. 

Although we have headed this section “General Inspection Problems,” in fact we shall deal
specifically only with cases where three restrictions are satisfied:

(a) We assume the classification of components into “effectives” and “defectives” is
a simple one, so that the two-dimensional inspection diagram is applicable. We also leave
aside any statistical questions which may arise from the possibility that the inspection test
used is not directly a “user test.” For example, with the “tropical testing” of radio components,
we do not actually test the components by taking them for two years to the
tropics; and since we cannot do this, many statistical problems arise in connection with a
substitute test, into which we do not enter here.

(b) We assume that the product is inspected in self-contained batches, of a fixed size B,
each batch being the subject of a single inspection judgment. This excludes the case of
“continuous production.”

(c) We assume that the components in the batches are thoroughly mixed when the batches
come up for inspection.

The process curve 

With the assumptions we have just made, each batch will, for our purposes, be uniquely 
described by the value of the fraction defective, p. And if we imagine the actual fractions 
defective of batches to be recorded over a period of time, we can form a histogram by plotting 
fractions defective against frequency of batches having this fraction defective. The curve obtained 
in this way we call the “ Process Curve ” over the period in question. 

In order to avoid awkward summation signs, we assume the batch size to be large enough 
to permit a continuous approximation. We may then represent the P curve by a function P(p), 
so that $P(p)\,dp$ is the relative frequency of batches with fraction defective between $p$ and $p + dp$.

We usually do not know the precise shape of the P curve, but we can often discover its general 
form. This will be determined partly by the properties of the production process — whether it is 
stable or unstable, whether the machines involved are working close to, or far from, their natural
tolerances, and partly by the actual batching procedure employed. 

The batching procedure is important. If we bear in mind that the general purpose of 
inspection is to separate sheep from goats, it should be obvious that a rational batching procedure, 
which secures that “ bad patches ” in production are kept isolated from the good patches, will 
often go half-way towards solving the inspection problem. No matter what sort of plan we
use, sequential or non-sequential, any such plan will find it easier to separate really good batches
from really bad batches than to sort out the “not so bad” from the “not so good” among a
set of mediocre batches. We can sum this up by saying that the “batching clause” of the
inspection scheme should be designed to secure that the P curve is “well-separated”: multi-modal
if possible, and with modes as widely separated as possible.

Apart from their direct value in production shops, one of the principal virtues of running 
“ Quality Control Charts ” is that they enable a rational batching procedure to be carried out. 

Acceptance sampling and rectifying inspection 

We now have to draw a distinction between those inspection plans in which the actual process 
of inspection makes no difference to the fraction defective in the particular batch being inspected, 
and those in which the fraction defective may be altered by the process of inspection. We have
already called inspection schemes of the first kind “Acceptance Sampling Schemes.” Schemes
of the second kind we call “Rectifying Inspection Schemes.”




In Section I, where we assumed the batch size N to be effectively infinite, and where our
decision was restricted to acceptance or rejection of the batch, we considered an A.S. scheme. On
the other hand, the well-known Dodge-Romig Single and Double Sampling Inspection plans are
examples of R.I. schemes; because, with these plans, there is a provision for 100 per cent. inspection
of doubtful batches, and it is assumed that when 100 per cent. inspection is carried out, the
defective items are replaced by effective items, and so the batch fraction is improved.

Broadly speaking, an A.S. plan will be in order if the cost of inspection of items is large compared
with the cost of their production. For example, if the test involved is a destructive one, the
cost of inspection must be at least equal to the cost of production; and in such a case we must
use an A.S. plan. The idea of 100 per cent. inspection in such a case is clearly ridiculous: it may
relieve the consumer’s feelings, but it will do little else.

The inspection properties of an A.S. plan will be specified by the O.C. curve of the plan, that is,
by the function $O(p)$, which gives the probability of acceptance of a batch whose fraction defective
lies between $p$ and $p + dp$.

On the other hand, the inspection properties of an R.I. scheme will be specified by a matrix,
or a function of two variables, $K(p,p')$, where $K(p,p')\,dp\,dp'$ represents the probability that, if a
batch is presented to the inspectors with fraction defective lying between $p'$ and $p' + dp'$, it will be
passed out by them after having had its fraction defective reduced to between $p$ and $p + dp$.
The matrix $K(p,p')$ is called the O.C. matrix.

The outgoing quality curve 

The effect of a given inspection scheme on a process with P curve given by P(p), in the case
of an A.S. scheme is given by

$$Q(p)\,dp = \frac{\Omega(p)\,P(p)\,dp}{\int_0^1 \Omega(p)\,P(p)\,dp} \tag{26}$$

while with an R.I. scheme, the effect is represented by

$$Q(p)\,dp = \frac{\int_0^1 K(p,p')\,P(p')\,dp' \cdot dp}{\int_0^1\!\int_0^1 K(p,p')\,P(p')\,dp'\,dp} \tag{27}$$

Thus, with an R.I. scheme, as compared with an A.S. scheme, the function of one variable,
Ω(p), is replaced by the functional operator

$$\int_0^1 K(p,p')\,(\quad)\,dp'$$

The function Q(p) represents the relative frequency of outgoing batches, whose fraction
defective lies between p and p + dp. The curve of Q(p) against p is therefore called the Outgoing
Quality Curve, or O.Q. curve.
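
As a rough numerical illustration of the A.S. relation (26), the sketch below replaces the continuous P curve by a discrete grid of batch qualities. The single-sampling O.C. function, the two-point process curve and the values n = 50, k = 2 are illustrative assumptions, not figures from the paper, and the O.C. function assumes an effectively infinite batch.

```python
from math import comb

def oc_single_sampling(p, n, k):
    """Probability of accepting a batch of quality p: at most k defectives in n items.
    Assumes an effectively infinite batch, so the count is binomial."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def outgoing_quality(grid, process, oc):
    """Discrete form of (26): weight each process ordinate by the O.C. value
    at that quality, then normalise so the outgoing ordinates sum to 1."""
    weights = [oc(p) * w for p, w in zip(grid, process)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical two-mode process curve: 90% of batches at 2% defective,
# 10% of batches at 20% defective.
grid = [0.02, 0.20]
process = [0.9, 0.1]
q = outgoing_quality(grid, process, lambda p: oc_single_sampling(p, n=50, k=2))
print(q)
```

With a well-separated process curve of this kind, nearly all of the outgoing weight falls on the good mode, which is the sense in which the scheme improves on the P curve.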

The consumer's quality curve 

The O.Q. curve gives the relative frequencies of batches of various qualities as passed by the 
inspection scheme. But it is not necessarily the curve in which the consumer is interested, because
he may not be directly interested in the quality of single batches. 

Suppose, for example, we have a consumer who uses the components inspected as parts of a 
larger assembly. What this consumer will be chiefly interested in is maintaining a steady flow 
over his production lines, and not having to shut down every so often through finding he has no
“effective” components in stock. Thus the group of items in whose quality this consumer is
interested will be a group whose size is determined by the number of components he habitually
carries in stock. This group of items, in whose quality the consumer is primarily interested, is
what we call the “Consumer’s Lot.”

The “Consumer's Lot” size will be determined by the consumer’s requirements; while the 
Producer’s Batch size will be determined by the producer’s convenience, and by the principles 
already indicated. So the Consumer Lot and the Producer Batch will not always be the same 
thing. 




Sometimes there may be several different consumer lot sizes in which the consumer is interested 
— if, for example, he runs several factories, with store-houses of different capacities. 

In general, we assume that the consumer lot size L is an integral multiple of the batch size B.
If L/B = n, then the Consumer's Quality Curve (C.Q. curve) will be obtained from the O.Q. curve
by “n-tuple convolution.” That is, the distribution of consumer lot quality is obtained from
the distribution of batch quality by finding the distribution of means of samples of n from the
O.Q. curve.

In particular, when n is very large, the C.Q. curve will have a single ordinate, at the mean of the
O.Q. curve:

$$\bar{Q} = \int_0^1 p\,Q(p)\,dp$$

This $\bar{Q}$ is then called the Average Outgoing Quality (A.O.Q.). If the consumer is interested only
in $\bar{Q}$, this corresponds to the case considered by Dodge and Romig under the heading “Average
Lot Quality Protection.”

Another case is when n = 1, and the consumer is anxious to avoid having more than a certain
proportion (say, one tenth) of his lots with quality worse than a certain “tolerance quality” t.
His requirements will then take the form that

$$\int_t^1 Q(p)\,dp \quad \text{must be less than } 0.1$$

This value “t” then corresponds (nearly, but not exactly) to Dodge and Romig's “Lot
Tolerance Quality.”
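
The passage from the O.Q. curve to the C.Q. curve can be sketched numerically. The example below assumes a discrete O.Q. distribution over the number of defectives in a batch of 10 items; the distribution, the batch size, the lot multiple n = 4 and the tolerance t = 0.25 are all hypothetical values chosen for illustration.

```python
def convolve(a, b):
    """Distribution of the sum of two independent defect counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def consumer_quality(oq, n):
    """n-fold convolution of the O.Q. distribution: oq[d] is the probability that an
    outgoing batch holds d defectives; the result indexes total defectives per lot."""
    lot = [1.0]
    for _ in range(n):
        lot = convolve(lot, oq)
    return lot

def mean_fraction(dist, items):
    """Mean fraction defective of a unit containing `items` items."""
    return sum(d * w for d, w in enumerate(dist)) / items

def tail_beyond(dist, items, t):
    """Probability that the fraction defective exceeds the tolerance quality t."""
    return sum(w for d, w in enumerate(dist) if d / items > t)

# Hypothetical O.Q. distribution for batches of 10 items: 0, 1 or 4 defectives.
oq = [0.7, 0.2, 0.0, 0.0, 0.1]
aoq = mean_fraction(oq, 10)          # average outgoing quality
lot4 = consumer_quality(oq, 4)       # consumer lot of n = 4 batches (40 items)
tail1 = tail_beyond(oq, 10, 0.25)    # n = 1: lot and batch coincide
tail4 = tail_beyond(lot4, 40, 0.25)  # n = 4: averaging thins the bad tail
print(aoq, tail1, tail4)
```

The mean is unchanged by the convolution, while the proportion of lots worse than the tolerance quality shrinks as n grows, in line with the limiting case in which the C.Q. curve collapses onto the A.O.Q.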

In practice, the requirements we have to meet will take the form of requirements about the
shape or form of the C.Q. curve. These will then have to be translated into requirements on the 
O.Q. curve. Then, having found the shape of the P curve, we can determine the form of the 
O.C. curve or the O.C. matrix which is necessary to meet requirements, using the relationships 
(26) and (27). 

The sample size matrix 

In designing an inspection scheme to have a given O.C. curve or O.C. matrix, we shall also be
concerned with the amount of inspection involved. For a sequential scheme, the sample size
will not be always the same number, even when the fraction defective p remains constant. For
constant p, we shall get a sample size distribution. And for various values of p, we get a family of
sample size distributions.

Thus we have a sample size matrix, s(x,p), such that s(x,p)dxdp represents the probability that
the sample size used will lie between x and x + dx, when the fraction defective lies between p and
p + dp.

The actual sample size distribution will depend on the process curve, and will be given by

$$S(x)\,dx = \int_0^1 s(x,p)\,P(p)\,dp \cdot dx \tag{29}$$

and the mean sample size is

$$\bar{S} = \int_0^\infty x\,S(x)\,dx = \int_0^\infty\!\int_0^1 x\,s(x,p)\,P(p)\,dp\,dx \tag{30}$$

while the variance is $\int_0^\infty (x - \bar{S})^2\,S(x)\,dx$.

In practical problems we shall wish not only to minimize $\bar{S}$ (i.e., to minimize the total amount
of inspection in the long run) but also we shall not want the variance of S to be unduly large,
since this would imply heavy fluctuations in the amount of inspection required per batch. This 
would mean fluctuating demands for inspection labour, or else an unsteady flow of batches 
through the inspection, and either of these would increase inspection overhead costs. 
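
The mean and variance of the sample size can be computed exactly for a simple closed scheme. The sketch below assumes a hypothetical scheme that accepts once 20 effectives have been found and rejects once 3 defectives have been found, with an effectively infinite batch and a two-point process curve; none of these figures come from the paper.

```python
def sample_size_dist(p, accept_after, reject_after):
    """s(x, p) for a hypothetical closed scheme: probability that inspection stops
    after exactly x items, accepting at `accept_after` effectives and rejecting
    at `reject_after` defectives, each item being defective with probability p."""
    max_n = accept_after + reject_after - 1   # the scheme must stop by this size
    dist = [0.0] * (max_n + 1)
    state = {(0, 0): 1.0}                     # (effectives, defectives) -> probability
    for n in range(1, max_n + 1):
        nxt = {}
        for (e, d), w in state.items():
            for de, dd, pr in ((1, 0, 1 - p), (0, 1, p)):
                e2, d2 = e + de, d + dd
                if e2 == accept_after or d2 == reject_after:
                    dist[n] += w * pr         # path ends on a boundary at size n
                else:
                    nxt[(e2, d2)] = nxt.get((e2, d2), 0.0) + w * pr
        state = nxt
    return dist

def mixed_dist(grid, process, accept_after, reject_after):
    """Discrete form of (29): average s(x, p) over the process curve."""
    dists = [sample_size_dist(p, accept_after, reject_after) for p in grid]
    return [sum(w * d[x] for w, d in zip(process, dists))
            for x in range(len(dists[0]))]

def mean_and_variance(dist):
    """Discrete form of (30) and of the variance integral."""
    mean = sum(x * w for x, w in enumerate(dist))
    var = sum((x - mean) ** 2 * w for x, w in enumerate(dist))
    return mean, var

S = mixed_dist([0.02, 0.20], [0.9, 0.1], accept_after=20, reject_after=3)
mean, var = mean_and_variance(S)
print(mean, var)
```

Comparing `mean` and `var` across candidate boundaries gives a direct view of the trade-off between total inspection and steadiness of the inspection load.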




General problems of design 

It should now appear that, given only the consumer's requirements, in the form of restrictions
on the acceptable O.Q. curve, the problem of finding the “best” type of inspection plan has no
solution. In fact, there is no such thing as a “good” type of inspection scheme, if by “good”
we mean “always good.” The suitability of a given type of inspection scheme in a given situation
depends on a host of circumstances, among which we may enumerate the P curve, the relationship
between batch and lot, and the relative importance of overhead and running costs in the inspection
department.

In particular, the P curve is of fundamental importance. The expressions (26), (27), (28), (29) 
and (30) all indicate that the results of our scheme depend essentially on the form of the P curve. 
And we shall shortly see that sometimes the P curve may be of such a kind as to make nonsense of 
some inspection schemes which, nevertheless, are quite good in other cases.

Use of the inspection diagram 

In working out special inspection schemes to suit special circumstances, we have found the 
inspection diagram to be a great help. 

We have to make an addition to the inspection diagram as already outlined, in the case of
R.I. schemes, because it is a necessary feature of such a scheme that the batch size B should be
finite. This can be done by drawing in the line

$$x + y = B \tag{31}$$

and then, when we are dealing with a batch whose fraction defective is p, this means that all our
permissible inspection paths must end at the point Q on this line, whose co-ordinates are (qB, pB).
Then, whatever the acceptance or rejection boundaries, we shall always have, with the notation
of section I,

$$P(X) = \frac{N'(OX)\,N(XQ)}{N(OQ)} \tag{32}$$

where N(XQ) is the number of paths from X to Q, regardless of the inspection boundary.

As one application of the inspection diagram, we can determine the general form of the function
K(p,p') for a quality inspection scheme. Because, if we accept at the point (x,y), and we assume
(as always) that we replace defective items by effective ones as soon as we find them, then
acceptance at (x,y) means reduction of the fraction defective from p' to p, where

$$p = p' - \frac{y}{B}$$

and the sample size used in the case will be

$$s = x + y$$

which shows that the general form of K(p,p') must be, for a given inspection boundary,

$$K(p,p') = L(p' - p) \tag{33}$$

where L(x) is an increasing function of x which vanishes when x is negative. Thus the functional
operators we are concerned with in R.I. schemes are generalisations of the “closed cycle”
operators much studied by Volterra.*

Similarly, if we write the equation to the A boundary as

$$x + y = n(y)$$

and if we have no R boundary (as in Dodge and Romig's schemes, where we replace rejection by
100% inspection), then the probability of arriving at (x,y), and accepting there, is the probability
of using the sample size s, and so s(x + y, p) must satisfy

$$s\bigl(n(B(p' - p)),\,p'\bigr) = K(p,p') \qquad (p \neq 0) \tag{34}$$

provided that our scheme is “contracted,” so that there is only one point on the boundary giving
the sample size s.

* In fact, K defines a “transition operator” on a “space (L).” See Birkhoff, “Lattice Theory”
(1940), p. 133.




As another application of the general formula (32), suppose we have a P curve which is
binomial in form, so that the a priori probability that the end-point will be Q = (q'B, p'B),
say, is

$$\binom{B}{D}\,p^{D}(1-p)^{E} = N(OQ)\,p^{D}q^{E}$$

and the a priori probability of reaching X = (x,y) is

$$N'(OX)\cdot N(XQ)\cdot p^{D}q^{E} = N'(OX)\,p^{x}q^{y}\cdot N(XQ)\,p^{D-x}q^{E-y}$$

Now, x is the number of defectives found by the inspector, while D − x is the number remaining
in the batch. If we now consider a single sampling scheme, with the boundary made up of points
on the line

$$x + y = n \tag{35}$$

we shall have $y = n - x$

and $E - y = (B - n) - (D - x)$

so the a priori probability of reaching the point P on the line (35) splits up into two factors, one
of which contains only the number of defectives found in the sample of n, while the other contains
only the number of defectives left in the batch.

This means that, if our P curve is binomial in form, and we take a single sample of n from
each batch, then the number of defectives found in the sample is distributed entirely independently
of the number of defectives left in the batch. In practical terms, if our P curve has the binomial
form, and we take single samples of n, accepting up to k defectives and rejecting more than k
defectives, then the fractions defective remaining in the batches we reject will have exactly the
same distribution as the fractions defective remaining in the batches we accept. Our O.Q. curve
is exactly the same as our P curve, and our single sampling scheme is a pure waste of time.
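
This independence can be checked by direct enumeration. The sketch below assumes hypothetical values B = 20, n = 5, p = 0.1 and acceptance number k = 0; it builds the joint distribution of defectives found and defectives remaining when the batch total is binomial and the sample is drawn without replacement, then compares the remainder in accepted and rejected batches.

```python
from math import comb

def joint_found_remaining(B, n, p):
    """Joint probability of (x found in the sample, r left in the rest of the batch)
    when the batch total D is Binomial(B, p) and n items are drawn without
    replacement, so that x given D is hypergeometric."""
    joint = {}
    for D in range(B + 1):
        prior = comb(B, D) * p**D * (1 - p)**(B - D)
        for x in range(max(0, D - (B - n)), min(n, D) + 1):
            hyper = comb(D, x) * comb(B - D, n - x) / comb(B, n)
            joint[(x, D - x)] = prior * hyper
    return joint

def remaining_given(joint, keep):
    """Distribution of remaining defectives, conditional on keep(x) being true."""
    out = {}
    for (x, r), w in joint.items():
        if keep(x):
            out[r] = out.get(r, 0.0) + w
    total = sum(out.values())
    return {r: w / total for r, w in out.items()}

B, n, p, k = 20, 5, 0.1, 0
joint = joint_found_remaining(B, n, p)
accepted = remaining_given(joint, lambda x: x <= k)
rejected = remaining_given(joint, lambda x: x > k)
worst_gap = max(abs(accepted.get(r, 0.0) - rejected.get(r, 0.0))
                for r in range(B - n + 1))
print(worst_gap)
```

Up to rounding error the two conditional distributions coincide, each being binomial on the B − n uninspected items, which is the numerical counterpart of the factorisation described above.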

Mood has obtained a similar result by another method. He also shows that the correlation 
between defectives found and defectives remaining is negative, if the P curve is more lepto-kurtic 
than the binomial curve; while it is positive only if the P curve is more platykurtic than the 
binomial curve. This further serves to emphasize the importance of having a “ well-separated ” 
process curve. 

Another application of the inspection diagram is made in what we have called the “matrix
method” of determining the properties of any inspection scheme. This depends on dividing the
interior of the inspection boundary by lines at 45° to the axes, into “slabs.” It can be shown
that each slab corresponds to a “transition probability matrix,” which can have only one out
of four possible forms. The total effect of the scheme is then obtained by multiplying up these
matrices. Mr. Armitage and Miss Stockman have applied this method to the examination of
“closed” sequential schemes in the paper already referred to.

Finally, the inspection diagram is of direct practical use in carrying out sequential methods of 
inspection. It suggests the construction of an electrical or mechanical network which will serve 
to carry out automatically the functions of counting, and registering acceptance or rejection as 
required. The possibility of making machines of this sort, which absolve the inspectors from the 
necessity of keeping running records of results, means that we need not be deterred from making 
up sampling schemes by the thought that they may be too complicated to carry out in practice. 

Conclusion 

May I conclude with some speculative statements, designed to provoke controversy?

First, with regard to sequential tests in general. The startling simplicity and efficiency of the 
P.R.S. test, which evidently was evolved by Wald by an extension of Neyman and Pearson's ideas 
on testing hypotheses, would seem to be a strong argument in favour of the adoption of these 
ideas as a basis for developing statistical tests. Even if we approach the matter from the point 
of view of L.S. tests, we have seen that the sequential method leaves us with an extra degree of 
freedom in the choice of test to use, after we have satisfied the requirements with regard to 
significance level and minimum sample size. In fact, for all the tests I have described, except 
only the test for comparative trials given on page 14, some idea equivalent to Neyman and 
Pearson's idea of “power” seems necessary to give us a unique test.

But it may be that those who reject the ideas of Neyman and Pearson would in any case reject
many sequential tests for another reason. For it seems difficult, if not impossible, to apply a
physical process of randomization to an “unclosed” sequential procedure. Wald's test for
double dichotomies is an exception, but in general it seems we can apply sequential procedures
only to cases where some sort of randomization is already imbedded, as it were, in the problem
considered. For example, in inspecting components from a box, we suppose the components
to have been already mixed up in the box, so that the sequence of results we get on inspecting
them can be considered to be a random sequence.

On the other hand, it may be said that the notion of a sequential procedure, which is intimately
bound up with every sequential test, really amounts to a change in the Neyman-Pearson theory. 
If so, the change is for the better, in that the actual procedure used in carrying out the experiment 
now does determine the test to be used. 

Next, with regard to the theory of statistical estimation by “fiducial limits” or by “confidence
intervals,” it has always seemed to me to be a defect of both these methods, that the distance
between the upper and lower limits for the estimate was dependent on the results of the trial,
and could not be fixed in advance. May we speculate that the extra degree of freedom we have
with sequential procedures may enable us to estimate parameters within limits fixed in advance?
Certainly Haldane's work² is a good step in this direction.

Thirdly, may I suggest that there is a rich, almost virgin field for exploration to be found in
studying the rules of games of chance and of skill? Many apparent eccentricities of these rules 
are due in reality to sound foresight on the part of the rule-makers. And we may be able to use 
their foresight in making tests for practical situations. To give one example: the rules of tennis 
require that at least four points must be played before the game is over, yet after these four points 
have been played, a majority of two is sufficient for game. This rule presumably is made to 
safeguard against the result being determined before the players have really settled down. Now 
we have a similar situation in sampling inspection, caused by the phenomenon christened by 
Mr. Womersley “Foreman’s fingers.” This refers to the remarkable ability, often shown by
experienced inspectors, to pick out most of the defectives in a batch in their first sample. To 
guard against this sort of thing, it is often advisable to put a lower limit on sample sizes, similar 
to the limit we have in tennis. 

Finally, in the general problems of sampling inspection we have an example of a practical 
situation in which we not only can, but must, apply Bayes’ Theorem (equation 26). Much earlier 
work on inspection seems to me to have been stultified by an attempt to carry over into this field 
the unmodified results of the theory of significance tests, where we take care to avoid Bayes’ 
Theorem. Of course, as Mr. Kendall has made very clear, there is a difference between Bayes’ 
Theorem and Bayes’ Axiom. But I rather think that, in assuming a specific form for the Process
Curve, often in circumstances where we have little but “engineering judgment” to go upon, we
are close to assuming at least a weak form of the axiom, as well as the theorem. 

Acknowledgements

My debt to Wald is obvious. Much of the above is based on Wald’s original report on 
sequential tests, published in 1943. His later paper¹ came to hand while this paper was being
prepared, and it contains much that overlaps with the above. 

My debt to friends and colleagues in the Ministry of Supply Advisory Service on Quality 
Control, where most of this work was done, is very great. At one time or another, almost every 
member of the department, besides those I have already mentioned by name, was drawn into work
on theoretical or practical aspects. Over us all was Mr. Womersley, who saw immediately the 
possibilities of the new developments and gave us every encouragement. 

Acknowledgement is also made to C.S.O., Ministry of Supply, for permission to publish this 
paper. 

Bibliography 

1. Wald, A., “Sequential Tests of Statistical Hypotheses,” Ann. Math. Statistics, 16, 117 (1945).

2. Haldane, J. B. S., Nature, 155, 49 (1945).

3. Todhunter, I., History of the Mathematical Theory of Probability, London, 1867.

4. Tweedie, M. C. K., Nature, 155, 453 (1945).

5. Khintchine, A. Y., Asymptoticheskie Zakoni Teorii Veroyatnostei, Moscow, 1936.

6. Ellis, Rev. Leslie, Cambridge and Dublin Mathematical Journal, 1845.





Discussion on Mr. Barnard's Paper 

Mr. Womersley, in proposing a vote of thanks, said this was an occasion of peculiar pleasure
to him, because here an enterprise in which he had taken a little part was giving this new or
rejuvenated Section its first fruits. He hoped this would be the first of a number of papers publishing
openly what had been done in secret. Mr. Barnard had said one or two kind words about him
which were not wholly deserved. He did not see how anyone, seeing this sort of thing arise,
could possibly have failed to encourage it in every possible way. This work, which was done
at the same time independently by Barnard and his associates here and by Wald and his associates in
the United States, did open a new chapter in statistical theory.

He would not say anything about the technical content of the paper because he had seen it so
many times that he felt he was not in a position to make any criticism. He hoped this work would
not in future be confined entirely to the problems which arose in industry. Those of them who had
been concerned with the application of statistics in industry owed a great debt to those working
in biological and agricultural fields, and they could in this way repay a little of the debt.

They would all agree that it was a very great pity that Mr. Barnard could not go into the
technical and industrial detail a little more on this occasion, but perhaps he might be able to extend
the practical end of this work at a future date.

Dr. Bartlett said that he was very glad to second this vote of thanks because, as
they all would now realize, but as some already knew before the paper was presented,
the author had played a major part in this country in the development of these sequential
methods. Mr. Barnard had rather modestly not referred that evening to one of his first
contributions, which he had described on page 14 of the paper (the section beginning, “Yet
another kind of sequential test for 2 × 2 trials,” etc.). It was the application of Fisher’s
z test in a rather ingenious way to discriminate small probabilities. That particular test, which
had been very useful (at least, he had found it useful), was actually in the hands of war-workers
before any work from America by Wald had reached this country. He found it useful
to distinguish here, however, between what Wald had called sequential tests and what Mr. Tweedie
had called inverse sampling methods. The distinction that he saw, in the important case of the
binomial distribution which Mr. Barnard had illustrated that evening, was that the inverse sampling
method was one in which, instead of in the ordinary way taking a fixed number of articles n and then
considering the number of defective articles, one turned the thing, as it were, upside down, and
decided to get a fixed number of defectives, and then went on and determined the size of sample.
That was the first kind of inverse sampling which was used. Possibly the name “sequential tests”
might (though this was a matter of opinion) be reserved for the strict procedure which was
suggested by Wald, corresponding to his theory, where, as Mr. Barnard had said, by starting from first
principles, he was able to formulate a uniquely defined technique.

However, as Mr. Barnard in his paper had shown, and as Professor Wald had shown in his
published paper in the Annals of Mathematical Statistics, in the historical notes at the beginning of
both papers, it was quite clear that a number of workers, during the war especially, had been groping
towards this kind of technique. From that point of view a rather curious situation arose during
the war, because there were a number of outside workers, Dr. Case, Professor J. B. S. Haldane,
and Mr. Tweedie, who were advocating these inverse sampling methods, and it placed those of them
who were, so to speak, in the “know” on this rather in a predicament as to whether they should
adhere to rigid secrecy (like Brer Rabbit, “lie low and say nothing”), or should divulge what they
knew.

He wished to consider now, rather in a broad fashion, the relations of this technique to orthodox
theory. Mr. Barnard had emphasized in his paper a very interesting parallel that existed
between the theoretical problems in these sequential tests and in classical problems which had arisen
in the past in connection with gambling and, in physics, in the theory of diffusion. Roughly the
division was that the discrete or difference equations he discussed in his paper had been connected
with gambling problems, and the approximate large sample theory which gave rise to differential
equations had been useful in the theory of diffusion in physics. But both these sides of the theory
had been really aspects of what, as Mr. Barnard had said, was called “random walk” theory.
He was rather fortunate to read Wald’s original report at about the same time as he was reading
a review by Chandrasekhar on random walk theory (Reviews of Modern Physics, January, 1943).
He mentioned this because, while he confined some slight work of his own at the time to large
sample theory, it did enable him to see these various broad relations. For example, the general
gain in efficiency for the mean of a normal sample in sequential tests (at least in the region of the
two points which, as Mr. Barnard had shown, determined the curve) must be a general result which
approximately applied to any sequential test; moreover, being linked to the ordinary theory of
likelihood estimates and the information function, the theory could be immediately extended
to the case of more than one unknown parameter. This had other applications. For example,
Mr. Barnard had stressed the value of the linear sequential test where the population had a certain
functional form; this made the test much simpler. That property appeared to him identical with
the simplicity property of likelihood equations which determined a sufficient statistic.

Again, later on, Mr. Barnard discussed the efficiency of a median, but one could see that the 
relative efficiency of different statistics must tend to be just the same in this theory as in the ordinary 
theory. The efficiency of a median would tend to be of the order of 64 per cent., as it was in 
ordinary theory. He, too, was therefore rather surprised that the median was as good as was sug- 
gested, because if the gain in a sequential test be considered to be a factor of about 2, it would imply 
that the median of the sequential test would be not very much better than the mean of the ordinary 
test. But he admitted that the transition to finite samples made that figure an approximation and 
no more. 

Finally, he wished to recall the warnings which Mr. Barnard had given them that these sequential 
procedures were not always practicable, and even if they were, one must be careful to take random
samples. But it was quite clear, apart from that, that this technique was of permanent theoretical 
value, and he thought that Mr. Barnard had done a great service by describing it to them that 
evening. 

Dr. Vajda said that there had been little time to study the paper, and therefore his remarks
would probably cover some of the problems dealt with by the author, but it would be a pity if the
experience of workers in this field were not exchanged. He had been encouraged also by the fact
that all this work was to a large extent carried out in wartime, when they could not speak to their
friends as they would have liked, and thus discover what others thought about ideas which were
slowly developing.

He thought the author would agree with him that this paper should be interpreted as a means
of encouragement, and he would like, therefore, to mention a few points which occurred to him. 

The author stated at the opening of the first section, “We assume that the number of components
in the box is large compared with the number of components we actually examine. . . .”
The speaker supposed that they must assume that the probabilities did not change during the
sampling, that is, that if they took something out of the box, the relative probabilities remained the
same as before. Was anything known of what happened (or in what direction changes happened) if
they were sampling from a finite, not a large, population, and during the process of sampling
the probability changed? Could anything be done on the safe side? He himself had come up
against this problem in tests in which, once the thing had been tested, it was destroyed and could
not be used again, but also the remaining probabilities were changed in two ways. If a component
were found defective, this defective component would no longer remain among these individuals,
so that not only would the probability of finding the defective be changed, but also the quality,
the quality depending on whether the defective were found and eliminated or not.

Another thing which had some connection with the sequential tests had happened in some work
he was doing when they had no idea whatever in advance what the quality might be, and therefore
had to start on some assumptions. During the test they found, perhaps, that it was necessary to
revise the specification, to take over many batches which might have been quite good, even if they
did not satisfy the original specification, and therefore they had thought it reasonable to develop
a system by which they “labelled” the batches. Of one batch, for example, they might be reasonably
sure that the components were to their specification, but for other batches, they had to
find an appropriate specification. In the end it was decided that it was not necessary to start with
any specification at all, but during the test they decided which specification could be reasonably
satisfied by these batches. The inverse might also arise, namely, the investigator might start
with a planned scheme and find that his batches were so good that it was a pity to stop there and
say, “This satisfies my specification anyway, and therefore I will not go on sampling.” Circumstances
might arise in which one said, “I am perfectly prepared to go on if it helps me to be
reasonably sure that this sample is much better than I had assumed.” It might happen that more
could be stated about these points and more information given from the laboratory.

Mr. M. C. K. Tweedie, who was called upon by the Chairman but had not prepared a
contribution to the discussion, said he could give a rough idea of how he became interested in the subject.
At the beginning of the war he was starting research in a physics department which had made a
large number of electrophoretic measurements on colloids. It had been noticed that the times
taken by the colloid particles to cover a fixed distance had some very strange properties. One
that stood out was that there was a considerable correlation between the variance and the mean.
He therefore got down to an investigation of the problems involved, a principal one being that of
inverse sampling with a continuous variate, which, as Dr. Bartlett had said, corresponded to a
large sample theory in sequential analysis. He succeeded in expressing the cumulants of a
continuous distribution in terms of those of its inverse, and later proved the simple relation between
the cumulant-generating functions.

On the general matter of Mr. Barnard’s paper he could say little, as, having received it only
the day before, he had had no chance to read it with the attention it deserved. However, he agreed
with Dr. Bartlett that the L.S. test and the P.R.S. test were probably equivalent in those distributions
which Fisher had shown to possess a sufficient statistic. This comprised a much wider class than
Mr. Barnard had indicated, and its members had the property that their inverse distributions also
belonged to the class.

Mr. A. E. Jones said that it seemed to him that the efficiency of the sequential sampling test
could not be judged without taking into consideration the possibility that no conclusion would be
reached without 100 per cent. sampling.

In certain circumstances it might be important to know that the probability that the lot fraction 
defective exceeded specification was small. It might not be adequate to say “ the probability that 
the lot fraction defective exceeds specification by more than a certain small quantity is small.” 

With parallel boundaries it would be necessary to limit the sample size, and in that proportion
of cases when no conclusion could be reached, 100 per cent. sampling might be necessary.
Therefore he thought it would be desirable to develop sequential tests of a type that would enable sampling
to continue indefinitely without having to fix a maximum sample size. What seemed to be required
was an expanding boundary, such that if the lot fraction defective was near to the limit of the
specification, the probability of incorrect acceptance or rejection might continue to be some
comparatively small quantity, however large the sample. He made the suggestion tentatively, but if
such a test could be developed it might be more satisfactory.

Dr. Yates congratulated Mr. Barnard on his admirable exposition of what was virtually a new
subject, and one which was of considerable fascination to mathematical statisticians. It was always
interesting and encouraging to find a field of statistics where mathematical theory was of real
value, the more so because, in spite of all that had been done to improve the methods followed in
the collection of data and the planning of experiments, statisticians still spent a great deal of time
on the rather empirical task of “making sense of figures.”

He desired to raise one point on the paper. After developing sequential tests of the ordinary
type, Mr. Barnard had gone on to discuss the development of sequential tests for observations
falling into the 2 × 2 type of contingency table. The situation envisaged was that one had a piece
of apparatus, made some improvement in design, and then wished to test whether that improvement
had really made the apparatus better. Mr. Barnard had reduced that problem to the question,
“Is the apparatus better or not?” Stated in that form it could be answered by the ordinary
test of significance applicable to 2 × 2 contingency tables, or by a sequential test, which in
certain circumstances would do the same kind of thing more efficiently. But the speaker thought
that the practical problem was not merely whether the improved apparatus was better, but also
how much better. In fact the problem was one of quantitative estimation, and not merely of
acceptance or rejection of a given hypothesis. Consequently it could best be treated by the ordinary
types of experimental test based on quantitative measurements.

This question of the postulation of the problem (in this case, what they really wanted to learn 
from the test) was very important, and change in postulation often altered radically the appropriate 
type of solution. There was always a tendency in any new development of statistics to attempt 
to extend it to problems which in fact were really better covered by procedures which had been 
developed specifically to deal with those problems. An analogous situation had arisen in agriculture, 
when formal tests of significance applicable to agricultural experiments were first developed. It 
then became the fashion on all occasions to test whether a particular treatment was significantly 
better than another treatment, whereas, looked at critically, the problem was usually one of 
estimation; what the experimenter really wanted to know was the quantitative difference between the 
two treatments. Fortunately agricultural experiments were designed to give this quantitative 
information, and the fact that a few irrelevant tests of significance were made on the results was of 
no great consequence; but if, as in the present case, the tests themselves were so designed that the 
quantitative information was not available, the matter was much more serious. 

As he saw it, sequential tests gave high precision in a particular region but low precision outside 
that region. This was exactly what was wanted when testing material to see whether it conformed 
to a given standard, but it was not what was wanted when testing improvements in apparatus; 
then one required equal precision over a wide range; in fact, over the whole of the range that was 
likely to be experienced in practice. 

At the end of the paper the author had made a number of statements which he said were intended 
to be provocative. He, Dr. Yates, did not wish to enter into controversy on these points, beyond 
stating that he considered that many of these apparently irreconcilable differences of theory 
depended largely on differences of approach, and were likely to be gradually synthesized into a more 
general theory. Dr. Bartlett, for example, had already drawn attention to the relationship with 
estimation. Mr. Barnard had raised the question of Bayes’s theorem 
as if this were a heresy which it was dangerous to utter. But in fact in the sequential type of 
problem there existed a distribution of the means (or other parameters) of the various batches, and it was 
perfectly reasonable to consider what would happen when this distribution assumed 
different forms. 




Mr. Anscombe desired to mention briefly some work with which he had been concerned at the 
Ministry of Supply following upon the work of which the author had spoken. It was concerned 
with the first problem he discussed, namely, sampling from a bulk of items to decide whether the 
proportion of defective articles was allowable. A sequential scheme of this kind was defined, in 
Mr. Barnard’s notation, by two handicap numbers (which at the beginning were denoted by H 
and H' and later by K and L) and a penalty number h. Furthermore, if the scheme was closed at 
some maximum sample size, that would be a further defining number. 

The work the speaker was referring to had consisted of taking specimen schemes and working 
out numerically what their properties were, that is to say, their operating characteristics and 
average sample size curves, using the formulae developed by Mr. Burman, Miss Stockman and Mr. 
Armitage. Later on a contract for further work was placed with Dr. Hartley of the Scientific 
Computing Service. 

If it was desired to apply one of these sequential schemes in practice, there were two questions 
on which some guidance was needed. Normally should we choose equal handicap numbers or 
unequal, and, if unequal, how far should they be unequal? Again, should the sequential test be 
closed at some maximum sample size, and, if so, how early should it be closed? The charts which 
Mr. Barnard had himself prepared were for unclosed sequential schemes with equal handicaps. 
But there was no reason why the handicaps should be equal, nor why the schemes should be 
unclosed. To try to obtain any general guidance on the type of scheme to use, it was necessary to 
examine these numerical cases, which furnished some idea empirically of when to use which sort of 
scheme. 

He would not attempt to summarize the results of the investigation, but would mention one 
point. In discussing the operating characteristic curve, the author said that if drawn on 
logarithmic probability paper most operating characteristic curves were sufficiently well represented by a 
straight line. In other words, they were defined by two points, a “safe point” and a “risk 
point.” That was true as a first approximation, and, of course, it was a very convenient fact, 
but it was not true exactly. Indeed, some operating characteristic curves could look very far from 
straight lines, and this factor did enter into the question of which type of scheme to use. Suppose 
one wanted to compare two sequential schemes, say a closed scheme with unequal handicaps 
and an unclosed scheme with equal handicaps. If the operating characteristic curve were defined 
by two points, a scheme of the second type could be found exactly equivalent to the first scheme, 
i.e., having the same operating characteristic. Then it might be found that for one part of the 
quality range the first scheme was better (i.e., gave lower average sample sizes) and for another 
part of the range the second scheme was better. To decide which would be better in a particular 
case it was necessary to consider the process curve and calculate the average amount of sampling 
to be expected. But the operating characteristic curve was not precisely defined by two points, 
and therefore any two schemes would always have a slightly different effect on incoming bulks. 
No two different schemes could have the same operating characteristic curve precisely, and the 
difference would be great enough to confuse this particular problem. 

In order to reach a definite conclusion as to the superiority of one sampling scheme over another, 
it seemed necessary to take into consideration not only the process curve and the various costs of 
inspection but also the cost of accepting and the cost of rejecting a bulk of any given quality, 
so that the whole transaction of the consumer buying from the producer could be costed in a single 
figure. 

The work to which he had referred would be issued shortly from the Ministry of Supply, and he 
hoped that further work would shortly be completed and published in the usual way. 

Mr. Bosanquet, who had been working over somewhat similar tests himself, asked whether 
the lattice scheme which the author had depicted on the blackboard, while all very well for a 
“Go” or “Not go” test, could be used if the path were actually measured. Could cumulative 
error be plotted? 

Mr. Barnard, replying to the discussion, thanked the audience for their kind reception of 
his disjointed exposition, and said that he would deal with most of the questions in writing. 
One or two points had been made, however, to which he could reply quickly. Dr. Vajda's 
point about the small batches, when the assumption set out at the beginning of section I of the 
paper was not true, was actually dealt with in section III under “General Inspection Problems.” 
In particular it was a remarkable result, which should be called Mood’s theorem, that when 
inspection took place by destruction methods from batches and the process was binomial in 
shape, it was useless to inspect at all with a simple inspection scheme. That emphasized the 
importance of the precise shape of the P curve. 

Concerning Mr. Jones’s point as to the possibility of having boundaries which were not parallel 
but which diverged, he had for some time thought that this would be a fascinating subject for 
exploration. He had not mentioned it in the paper because it had not been explored, and he 
wished someone would set about it. Provided the boundaries were parallel, he had shown that 
the mean sample size was finite, and therefore the probability of going on for ever was nought. 




In reply to Dr. Yates and Mr. Bosanquet, these sequential methods could be used with tests 
involving measurements, as well as with tests involving simple classification by attributes. 
That was not very clearly set out, but it was mentioned. It was possible to set up tests in that 
form which were based on the cumulative sums of scores. He thought the simplest formula 
for that sort of thing was obtained by assuming the distribution of scores to be normal and 
then applying the formula set out on page 10. 

Mr. Barnard later added the following comments in writing : 

Concerning Dr. Yates's comments, it is quite true that, in testing two designs of an 
apparatus, we are not usually interested in the mere question whether one design is better than 
the other. But many practical problems in such connections can be reduced to the problem 
“Should we adopt this design or that?” In other words, we wish to decide between two 
alternative courses of action. For example, we might consider the adoption of a complicated 
design rather than a simple one as justified only if the complicated design is twice as good as 
the simple one, on account of the higher cost of production. But such questions can be answered 
by sequential methods just as well as by classical methods even if the experiment itself is used 
to determine the increase in cost of production as well as the actual improvement in performance, 
provided there is sufficient forethought. And always, of course, provided the experiment is 
not like the one on trees which I mentioned in the introduction. 

It should also be made clear that the data obtained in a sequential procedure can be used 
to derive estimates of relevant quantities, provided the proper formulae are used. The results of 
Dr. Case and Professor Haldane indicate that sequential methods may even have some 
advantages here. 

With regard to Dr. Bartlett's points on the applicability of the theory of likelihood estimates 
and the information function, we have to remember that there are really two “sample sizes” 
involved in sequential procedures. We may take our observations n at a time, and use a statistic 
T calculated from these n as the basis of our test procedure. To arrive at a decision, we may 
then have to take N sets, each of n observations. So when we speak of “large-sample theory” 
we may mean that n is large, or that N is large. If n is large, then there is no doubt that the 
likelihood theory will apply to the choice of statistic T, subject to the usual conditions. But if 
N is large, while n is small, I think the likelihood theory will apply only if the sequential test 
used is a linear one. 

In particular, the “binomial” type of test indicated in the paper as a possible alternative to 
the “sequential t-test” is not based on taking the median as statistic instead of the mean. The 
function of the observations which is used is not even a continuous one, so that the differential 
approximations used in the theory of the information function are inapplicable. 

Finally, as Mr. Tweedie first pointed out, my original form for the condition that a P.R.S. 
test should be a linear test was incorrect, and I am grateful to him and to Mr. Lindley for drawing 
my attention to this. I have taken the liberty of amending my original formula according to a 
suggestion made by Mr. Lindley and Mr. Armitage, who also drew attention to a mistake in 
formula (33), and to several minor errors. 





Symposium on Autocorrelation in Time Series 

Contents 

On the Theoretical Specification and Sampling Properties of Autocorrelated Time-Series. By M. S. Bartlett 

Some Instruments for the Analysis of Time Series and Their Application to Textile Research. By G. A. R. Foster 

Random Processes in Problems of Air Warfare. By L. B. C. Cunningham and W. R. B. Hynd 


On the Theoretical Specification and Sampling Properties of Autocorrelated Time-Series 

By M. S. Bartlett 

[Read before the Research Section of the Royal Statistical Society, January 29th, 1946, 
Dr. J. Wishart in the Chair.] 


Contents 

1. Preliminary remarks. 

2. Standard error formulae for the autocorrelations of discrete time-series. (a) The Markoff 
process. (b) The general process. 

3. The specification of continuous time-series by their autocorrelation functions. 

4. Standard error formulae for the autocorrelations of continuous time-series. (a) The Markoff 
process. (b) The general process. 

5. Detailed specification of the second-order process. 

6. The estimation problem; theoretical information available on the unknown parameters. 
Application to Wolfer's sunspot numbers. 

7. Concluding remarks. 

8. References. 

1. Preliminary remarks 

It was suggested at the R.S.S. meeting at which Mr. M. G. Kendall's recent paper (9) on the 
analysis of time-series was read that further discussion should be given to the problem of the 
arduous labour of calculating correlograms. While I understand that later speakers will describe 
new methods of calculating auto- or other serial correlations,* the purpose of the present 
symposium I interpret to be wider, for it is no use knowing how to calculate correlation coefficients 
if we do not know what they mean. Now, their interpretation depends on two interdependent 
things: the appropriateness of the theoretical scheme assumed and the magnitude of sampling 
fluctuations. Kendall, following Yule (18), has stressed that for most time-series an autoregressive 
or autocorrelation scheme is more relevant than the assumption of exact harmonic oscillations 
detectable by periodogram analysis. He has also pointed out the need for pooling theoretical 
results on this problem from the various fields of research where it has arisen, e.g., in economics, 
meteorology, gunnery or in the theory of electrical fluctuations. Nevertheless, as in certain 
respects I felt he stopped short in his presentation of autocorrelation theory, both generally and 
on the particular question of sampling errors, my purpose here will be twofold: 

(i) I shall amplify some suggestions I made in the discussion on his paper about the 
sampling errors of a correlogram. The formulae I obtain are rather crude, and in some cases 
not new, but they serve to indicate the order of magnitude of the errors. 

(ii) I shall try to link up the work of the “English school” with work which it has rather 
neglected, namely, the important mathematical work developed of recent years on the 
autocorrelation theory of continuous time-series. I shall not attempt any comprehensive review, 
but some acquaintance with it seems essential to anyone researching in the theory of 
time-series. In particular, I shall show that for the second-order oscillatory process considered 
by Yule and Kendall it leads to a more fundamental grasp of the dual problem of specification 
and sampling errors. 

* Kendall has employed the term autocorrelation to denote a true value of which the observed value 
is the serial correlation. I shall use what I think to be a more logical terminology, viz., serial correlation 
for any correlation of one time-series with another, and autocorrelation for the particular serial correlation 
of a series with itself. The standard notation of $\rho$ for a true correlation coefficient, and $r$ for the sample 
value, will be used. 


2. Standard error formulae for the autocorrelations of discrete time-series 

It will be convenient to obtain straight away the standard error formulae appropriate to discrete 
time-series of the general type discussed by Kendall (7, 8, 9) and others. Throughout this paper 
I shall consider only series for which expected values are functions merely of the intervals between 
the terms of the series; this property characterizes stationary time-series, in contrast with 
time-series which are “evolving” and for which the calculation of serial correlations from successive 
observations would be rather meaningless. For the simplest linear process we have 

$$x_{t+1} = \rho x_t + \varepsilon_{t+1}, \tag{1}$$

where $\varepsilon_{t+1}$ is independent of $x_t$. For convenience, $x_t$ is assumed “standardized,” i.e., the 
expected value $E\{x_t\} = 0$, and $E\{x_t^2\} = 1$. In the scheme (1) $x_{t+1}$ depends on $x_t, x_{t-1}, \ldots$ 
only through $x_t$, the partial correlations with $x_{t-\tau}$ when $x_t$ is given, being zero. It is 
termed a Markoff process, owing to its relationship with the sequences or “chains” of variables 
studied by Markoff (cf. Khintchine, 10, p. 604). 

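(a) The Markoff process. We evaluate by straightforward algebra (cf. 2) the expressions 

$$\begin{aligned}
\mathrm{var}(\mathrm{var}) &\equiv E\{(\Sigma x_r^2/n)^2\} - 1, \\
\mathrm{var}(\mathrm{cov}) &\equiv E\{(\Sigma x_r x_{r+s}/n)^2\} - \rho^{2s}, \\
\mathrm{cov}(\mathrm{var}, \mathrm{cov}) &\equiv E\{(\Sigma x_r^2)(\Sigma x_r x_{r+s})/n^2\} - \rho^{s},
\end{aligned}$$

where the summation is over the $n$ observations $x_r$ ($r = 1 \ldots n$). It is proposed to give only 
the dominant first term in the expansion of the results in powers of $1/n$; to this order, measurement 
from the sample mean, or “end effects” (for $s \ll n$), may be neglected. We obtain 

$$\begin{aligned}
\mathrm{var}(\mathrm{var}) &\sim \frac{(\gamma + 2)(1 + \rho^2)}{n(1 - \rho^2)}, \\
\mathrm{var}(\mathrm{cov}) &\sim \frac{1}{n}\left[\frac{(1 + \rho^2)(1 + \rho^{2s})}{1 - \rho^2} + 2s\rho^{2s} + \frac{\gamma\rho^{2s}(1 + \rho^2)}{1 - \rho^2}\right], \\
\mathrm{cov}(\mathrm{var}, \mathrm{cov}) &\sim \frac{\rho^{s}}{n}\left[2s + \frac{(\gamma + 2)(1 + \rho^2)}{1 - \rho^2}\right],
\end{aligned} \tag{2}$$

where $\gamma$ is the measure of non-normality $E\{x^4\} - 3$. Since to the same order, if $r_s$ denotes the 
correlation with lag $s$ obtained from the series, 

$$\mathrm{var}(r_s) \sim \mathrm{var}(\mathrm{cov}) + \rho_s^2\,\mathrm{var}(\mathrm{var}) - 2\rho_s\,\mathrm{cov}(\mathrm{var}, \mathrm{cov}),$$

we obtain finally, independently of $\gamma$, 

$$\mathrm{var}(r_s) \sim \frac{1}{n}\left[\frac{(1 + \rho^2)(1 - \rho^{2s})}{1 - \rho^2} - 2s\rho^{2s}\right], \tag{3}$$

where $\rho_s = \rho^s$. When $s$ is large enough for $\rho_s$ to be small, this becomes 

$$\mathrm{var}(r_s) \sim \frac{1}{n}\,\frac{1 + \rho^2}{1 - \rho^2}, \tag{4}$$

a result which is more easily obtained from the formula for var (cov). 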
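
As a numerical illustration of formulae (3) and (4), the following Python sketch evaluates the leading-order variance of $r_s$ for a Markoff process. The function names and the chosen values of $\rho$ and $n$ are assumptions made for this illustration only; they do not appear in the paper.

```python
def var_r_markoff(rho, s, n):
    """Leading-order variance of the lag-s serial correlation, formula (3)."""
    r2 = rho * rho
    decay = rho ** (2 * s)
    return ((1 + r2) * (1 - decay) / (1 - r2) - 2 * s * decay) / n


def var_r_markoff_large_lag(rho, n):
    """Limiting form (4), appropriate once rho**s has become negligible."""
    r2 = rho * rho
    return (1 + r2) / ((1 - r2) * n)


if __name__ == "__main__":
    rho, n = 0.5, 100  # illustrative values only
    for s in (1, 2, 5, 10, 20):
        print(s, var_r_markoff(rho, s, n), var_r_markoff_large_lag(rho, n))
```

For $s = 1$ formula (3) collapses to $(1 - \rho^2)/n$, and for increasing $s$ the printed values approach the constant given by (4).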

(b) The general process. The above formulae are given for reference, but they are of limited 
use in practice because we usually have to deal with time-series of more complicated character. 
For the generalization of (1) to a linear autoregressive scheme of any order, or in fact for any 
time-series, we can, of course, write down formal sums which correspond to the above results for 
the Markoff process. The most general result we require is ($x_t$ standardized) 

$$\mathrm{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) \equiv E\{(\Sigma x_r x_{r+s})(\Sigma x_r x_{r+s+t})/n^2\} - \rho_s\rho_{s+t},$$

which to the same order of approximation as before becomes 

$$\frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t} + \kappa_{v,s,t}\right), \tag{5}$$




where $\kappa_{v,s,t}$ is the first seminvariant involving all the four variables, 
and is a function of their intervals apart characterized by the suffices $v$, $s$, $t$. 
When the $x_t$ are normally distributed, formula (5) reduces to 

$$\frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t}\right), \tag{6}$$

a useful result previously given by Daniell (5).* 

To obtain var (var); var (cov); cov (var, cov), we put $s = t = 0$; $t = 0$; $s = 0$ and $t = s$ 
respectively in (5) or (6). Thus we obtain from (6): 

$$\mathrm{var}(r_s) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v^2 + \rho_{v-s}\rho_{v+s} + 2\rho_s^2\rho_v^2 - 4\rho_s\rho_v\rho_{v+s}\right). \tag{7}$$

An important special case is when the true value $\rho_s$ has become small. The sampling errors of 
correlations are then, as noted for the Markoff process, approximately equivalent to those of the 
corresponding covariances. We obtain from (6) when $\rho_u$ is negligible for $u > s$, 

$$\mathrm{var}(r_s) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\rho_v^2, \qquad \mathrm{cov}(r_s, r_{s+t}) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\rho_v\rho_{v+t}. \tag{8}$$

Formula (4) is the special case of var ($r_s$) in (8) when $\rho_s = \rho^s$. It should be noticed that the general 
term $x_r x_{r+s} x_u x_{u+s+t}$ to be averaged in the original expression for cov (cov) can contribute 
only one term $\rho_v\rho_{v+t}$ (where $v = u - r$), if $s$ is large enough for the dependence of $x_u$ on $x_r$ to 
be negligible for $u > \tfrac{1}{2}s$, and, under this condition, (8) will be true irrespective of the normality 
condition. They are then equivalent to Slutsky's formula for the autocorrelations of the $r_s$ series 
when $\rho_s$ is negligible (16, equation (5), p. 128).† 
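
The relation between the general sum (7) and the Markoff formula (3) can be checked numerically. The sketch below is an illustration only: it assumes a true correlogram $\rho_v = \rho^{|v|}$, truncates the infinite sum at a hypothetical `v_max`, and uses function names that are not taken from the paper.

```python
def var_r_general(rho_fn, s, n, v_max=200):
    """Formula (7): leading-order var(r_s) for a normal series with true correlogram rho_fn."""
    rs = rho_fn(s)
    total = 0.0
    for v in range(-v_max, v_max + 1):  # truncated version of the infinite sum over v
        rv = rho_fn(v)
        total += (rv * rv
                  + rho_fn(v - s) * rho_fn(v + s)
                  + 2 * rs * rs * rv * rv
                  - 4 * rs * rv * rho_fn(v + s))
    return total / n


def markoff_correlogram(rho):
    """True correlogram rho_v = rho**|v| of the Markoff process (1)."""
    return lambda v: rho ** abs(v)


def var_r_markoff_closed(rho, s, n):
    """Closed form (3) for the Markoff process."""
    r2 = rho * rho
    return ((1 + r2) * (1 - rho ** (2 * s)) / (1 - r2) - 2 * s * rho ** (2 * s)) / n


if __name__ == "__main__":
    rho, n = 0.6, 50  # illustrative values only
    for s in (1, 3, 8):
        print(s, var_r_general(markoff_correlogram(rho), s, n), var_r_markoff_closed(rho, s, n))
```

The two columns printed agree to rounding error, which is what the reduction of (7) to (3) requires.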

However, we can go further than this for a wide class of time-series for which I am using the 
generic title of linear processes. These are characterized by the property that any term $x_s$ is the 
linear superposition of the effects of a number of independent values of a random variable $\varepsilon$, 
so that we may write: 

$$x_s = \sum_{w=-\infty}^{\infty} g(s - w)\,\varepsilon_w, \tag{9}$$

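where the function $g(s - w) = 0$ for $w > s$. We have had a simple example of such a linear 
process in the Markoff process (1). From (9) it follows at once from the properties of 
seminvariants that the simultaneous cumulant or seminvariant function 

$$\begin{aligned}
K(\tau_1, \tau_2, \tau_3, \tau_4) &\equiv \log E\{\exp i(\tau_1 x_r + \tau_2 x_{r-s} + \tau_3 x_{r-v} + \tau_4 x_{r-v-s-t})\} \\
&= \sum_{w=-\infty}^{\infty} K_\varepsilon\big(\tau_1 g(w) + \tau_2 g(w - s) + \tau_3 g(w - v) + \tau_4 g(w - v - s - t)\big),
\end{aligned} \tag{10}$$

where $K_\varepsilon(\tau)$ is the cumulant function of $\varepsilon$. Hence 

$$\begin{aligned}
\sigma^2(x) \equiv \mathrm{var}(x) &= \sigma^2(\varepsilon)\sum_w g^2(w), \\
\mathrm{cov}(x_r, x_{r+s}) &= \sigma^2(\varepsilon)\sum_w g(w)g(w - s), \\
\kappa_{v,s,t} &= \kappa_4(\varepsilon)\sum_w g(w)g(w - s)g(w - v)g(w - v - s - t),
\end{aligned} \tag{11}$$

and 

$$\sum_v \kappa_{v,s,t} = \kappa_4(\varepsilon)\Big[\sum_w g(w)g(w - s)\Big]\Big\{\sum_u g(u)g(u - s - t)\Big\} = \gamma(\varepsilon)\,\mathrm{cov}_s\,\mathrm{cov}_{s+t}. \tag{12}$$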

From this last result (12) we readily obtain from the general formula ($\sigma^2(x) = 1$) 

$$\mathrm{cov}(r_s, r_{s+t}) \sim \mathrm{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) + \rho_s\rho_{s+t}\,\mathrm{var}(\mathrm{var}) - \rho_s\,\mathrm{cov}(\mathrm{var}, \mathrm{cov}_{s+t}) - \rho_{s+t}\,\mathrm{cov}(\mathrm{var}, \mathrm{cov}_s)$$

the result that for any linear process of the type (9) cov ($r_s$, $r_{s+t}$) or var ($r_s$) are to the present 
order of approximation independent of the distribution of $x_t$; thus cov ($r_s$, $r_{s+t}$) becomes 

$$\mathrm{cov}(r_s, r_{s+t}) \sim \frac{1}{n}\sum_{v=-\infty}^{\infty}\left(\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t} + 2\rho_s\rho_{s+t}\rho_v^2 - 2\rho_s\rho_v\rho_{v+s+t} - 2\rho_{s+t}\rho_v\rho_{v+s}\right). \tag{13}$$

* I am indebted to Dr. H. E. Daniels for this reference, which has not been generally published. 
† In his paper Slutsky also refers to a previous paper “On the standard error of the correlation 
coefficient in the case of homogeneous coherent chance series” (in Russian), Transactions of the 
Conjuncture Institute, 2 (1929), 94. Unfortunately I have not been able to locate this paper anywhere in this 
country. 
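
The factorization used in (12), on which the distribution-free result (13) rests, can be verified directly for any finite set of weights. The sketch below is illustrative: the weights `g`, the helper names and the chosen lags are assumptions, and $g(k)$ is taken as zero outside the stored range.

```python
def g_at(g, k):
    """Weight g(k), taken as zero outside the stored range 0..len(g)-1."""
    return g[k] if 0 <= k < len(g) else 0.0


def lagged_product_sum(g, lag):
    """Sum over w of g(w) * g(w - lag)."""
    return sum(g_at(g, w) * g_at(g, w - lag) for w in range(len(g)))


def fourth_order_sum(g, s, t):
    """Sum over v and w of g(w) g(w-s) g(w-v) g(w-v-s-t), the left side of (12)."""
    m = len(g)
    total = 0.0
    for v in range(-m, m + 1):  # g(w - v) vanishes for |v| >= m when 0 <= w < m
        for w in range(m):
            total += (g_at(g, w) * g_at(g, w - s)
                      * g_at(g, w - v) * g_at(g, w - v - s - t))
    return total


if __name__ == "__main__":
    g = [1.0, 0.5, -0.25, 0.125]  # hypothetical weights, for illustration only
    s, t = 1, 2
    print(fourth_order_sum(g, s, t))
    print(lagged_product_sum(g, s) * lagged_product_sum(g, s + t))
```

Both printed numbers are equal up to rounding: summing over $v$ separates the double sum into the product of two lagged sums, which is why the fourth-order term in (5) contributes only $\gamma(\varepsilon)\,\mathrm{cov}_s\,\mathrm{cov}_{s+t}$.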

The rather curious result in (8), that the sampling properties of $r_s$ when $\rho_s$ has become small 
depend on the “variance” and covariances of $\rho_v$ in the correlogram, seems sufficient to explain 
the reluctance of an observed correlogram to damp down to zero with $\rho_s$, a point which worried 
Kendall when he came across it empirically. From (8) we see that the standard error of $r_s$ will 
always be larger than $1/\sqrt{n}$, and that the observed correlogram will preserve a misleading 
regularity even when $\rho_s$ is zero, the correlogram for neighbouring values of $r_s$ being the “correlogram” 
of the true correlogram. 

For example, let us consider Kendall’s artificial series (see 8, Table 3) 

$$x_{t+2} = 1.1\,x_{t+1} - 0.5\,x_t + \varepsilon_{t+2}, \tag{14}$$

for which 

$$\rho_{s+2} = 1.1\,\rho_{s+1} - 0.5\,\rho_s. \tag{15}$$

Kendall, giving values of $r_s$ up to $s = 30$, obtained an $r$ of 0.57 for $s = 25$ (for an $n$ of 65), 
together with values of 0.56 and 0.43 at two other lags, values which appeared unexpectedly high compared with 
the true values $\rho_s$, which have effectively dropped to zero after $s = 10$. But from the true values 
of $\rho_s$, most easily obtained in succession from (15), and recently given by Kendall (Table II in his 
Appendix to 18), we obtain var ($r_s$) $\sim 2.44/n$, and a “correlogram” of the correlogram as shown 
in Table I. 


Table I 

Correlations of the correlations $r_s$, $r_{s+t}$ 

     t    correlation      t    correlation      t    correlation 
     1      +0.832         7      -0.118        13      -0.015 
     2      +0.434         8      +0.022        14      -0.027 
     3      +0.002         9      +0.096        15      -0.024 
     4      -0.286        10      +0.102        16      -0.012 
     5      -0.304        11      +0.071        17      -0.010 
     6      -0.276        12      +0.019        18      +0.005 

If we consider $r_s$ for $s = 11$ to 30, $s$ is no longer small compared with the total number (65) 
of observations. It is therefore a rather better approximation for each $s$ to consider $n$ as the 
number of pairs of observations actually correlated (cf. Daniell, 5). This gives for the same range 
of $s$ an average value of var ($r_s$) of 0.053. The observed value was computed to be 0.083, with an 
effective number of degrees of freedom less than 20 because of formula (8). If we suppose that 
we can treat the terms $r_s$ analogously to terms in the original time-series, but with the correlogram 
of Table I, for which the sum of squares (summed over all $t$) is 3.42, the effective number of degrees of freedom 
will be more like 20/3.42 ≈ 6. A ratio 0.083/0.053 with 6 d.f. would not reach the 5 per cent. 
significance level. This adaptation of standard tests is admittedly rough, but a test based on the 
highest absolute value observed, 0.57 (for which $n = 40$, and the effective size of sample of which 
it is the largest member again about 6), would yield a similar conclusion. Thus it may be 
concluded that the observed values of $r_s$ have come out a little high, but not significantly so. With 
correlograms we must evidently take care not to allow the tail to wag the dog! 
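
The figures quoted for Kendall's series can be reproduced approximately from the recurrence (15) and formula (8). The following Python sketch is an illustration under stated assumptions: the starting value $\rho_1 = 1.1/1.5$ is the usual Yule-Walker value for scheme (14), the infinite sums are truncated at a hypothetical `max_lag`, and all function names are invented for this example.

```python
def ar2_correlogram(a1, a2, max_lag):
    """True correlogram of x[t+2] = a1*x[t+1] + a2*x[t] + e[t+2], built from recurrence (15)."""
    rho = [1.0, a1 / (1.0 - a2)]  # rho_1 follows from rho_1 = a1 + a2 * rho_1
    for s in range(2, max_lag + 1):
        rho.append(a1 * rho[s - 1] + a2 * rho[s - 2])
    return rho


def lagged_sum(rho, t):
    """Truncated sum over v of rho_v * rho_(v+t), as required by formula (8)."""
    m = len(rho) - 1
    total = 0.0
    for v in range(-m, m + 1):
        j = abs(v + t)
        if j <= m:  # terms beyond the stored range are treated as zero
            total += rho[abs(v)] * rho[j]
    return total


def correlogram_of_correlogram(rho, max_t):
    """Correlations between r_s and r_(s+t) implied by (8), for t = 0..max_t."""
    base = lagged_sum(rho, 0)
    return [lagged_sum(rho, t) / base for t in range(max_t + 1)]


if __name__ == "__main__":
    rho = ar2_correlogram(1.1, -0.5, 200)
    n_times_var = lagged_sum(rho, 0)  # n * var(r_s) when rho_s is negligible
    corr = correlogram_of_correlogram(rho, 60)
    sum_sq = 1.0 + 2.0 * sum(c * c for c in corr[1:])
    print("n * var(r_s):", n_times_var)
    print("first correlations:", corr[1:7])
    print("sum of squares:", sum_sq, "effective d.f. from 20:", 20.0 / sum_sq)
```

The first printed quantity comes out close to the 2.44 quoted in the text, and the sum of squares is close to the 3.42 used to reduce 20 degrees of freedom to about 6. Small differences from the printed table are to be expected, since the original values were computed by hand to three decimals.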

For a Yule-Kendall process like (14), $\rho_s$ is of the form $A a_1^s + (1 - A) a_2^s$ and $\Sigma\rho_v^2$ theoretically 
summable; similarly for any more general linear process. But in practice it is often simpler to 
evaluate $\Sigma\rho_v^2$ directly from the numerical values as above. If the $\rho_s$ are not known, it is, however, 
meaningless to consider $\Sigma r_v^2$, since var ($r_s$) for large $n$ is of order $1/n$, and the series cannot possibly 
converge. The only valid procedure would appear to be to fit a theoretical scheme containing 
one or two unknown parameters, such as the autoregressive scheme above, and obtain $\Sigma\rho_v^2$ from 
the corresponding theoretical correlogram. It may be that a purely autoregressive analysis is 
sufficient, and this then has the advantage that the usual regression tests of significance, though 
not exactly applicable, will be approximately valid for $n$ large (see 12). If, for example, a scheme 
like (14) were correct, the multiple regression of $x_{t+2}$ on $x_{t-1}, x_{t-2}, \ldots$ when $x_{t+1}$ and $x_t$ are 
held constant, should be zero. Unfortunately, as Yule and Kendall have pointed out, 
superposed error complicates such an analysis. Further complications which arise when such an 
analysis is applied to continuous time-series are discussed in sections 5 and 6. 

In some of the preceding variance formulae for autocorrelations, the effective number of 
degrees of freedom has been reduced by the factor $1/\Sigma\rho_v^2$. This result may be compared with 
Yule’s factor $1/\Sigma\rho_v$ for the variance of the mean (19), but, unlike the latter factor, it is essentially 
less than one. In the discussion on Kendall's paper (9), Champernowne appears to have 
suggested the use of the Yule factor, or at least its value of $(1 - \rho)/(1 + \rho)$ if the series is a Markoff 
process, for testing significance in periodogram analysis. But in such an analysis we test the 
significance of a weighted mean, the weights being the appropriate harmonic coefficients; the 
factor will correspondingly be a function of these coefficients. It may, it is true, be shown that 
for a Markoff process the minimum value of the factor is $(1 - |\rho|)/(1 + |\rho|)$, which is equal to 
the Yule factor if $\rho$ is positive. But it is also not clear what the interpretation of such a test 
would be. On the null hypothesis that there is no harmonic term we might identify the 
autocorrelations in this factor with those in the series (with appropriate precautions, as for $\Sigma\rho_v^2$ above). 
But if there is a harmonic term, $\Sigma\rho_v$ even for an infinite series would not converge (cf. Wold 21, 
section 17). Unless we adopt the laborious procedure of isolating the residuals for separate 
study, we are in danger of eliminating the bias of finding harmonic terms when they do not exist 
at the cost of never finding them when they do exist. And of course it is still important to study 
the oscillations intrinsic in the autocorrelated series, which has corresponding to its correlogram 
a “periodogram” of an entirely different character (see section of this paper). We must not 
throw away the baby with the bath-water! 
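
For the Markoff process the two reducing factors discussed above have simple closed forms, which can be compared with the direct sums. The sketch below is illustrative only; the function names and the truncation point `v_max` are assumptions.

```python
def yule_factor_markoff(rho):
    """Yule's factor 1 / sum(rho_v) for the variance of the mean, with rho_v = rho**|v|."""
    return (1 - rho) / (1 + rho)


def correlation_factor_markoff(rho):
    """The factor 1 / sum(rho_v**2) appearing in the variance of r_s, with rho_v = rho**|v|."""
    return (1 - rho * rho) / (1 + rho * rho)


def factor_from_sum(rho, power, v_max=500):
    """Direct (truncated) evaluation of 1 / sum over v of (rho**|v|)**power."""
    total = sum((rho ** abs(v)) ** power for v in range(-v_max, v_max + 1))
    return 1.0 / total


if __name__ == "__main__":
    for rho in (-0.5, 0.2, 0.5, 0.8):  # illustrative values only
        print(rho, yule_factor_markoff(rho), correlation_factor_markoff(rho))
```

For negative $\rho$ the Yule factor exceeds one, whereas the factor $1/\Sigma\rho_v^2$ never does, which is the contrast drawn in the text.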

3. The specification of continuous time-series by their autocorrelation functions 

It will have been apparent that the discrete nature of our observations in many economic and 
other time-series does not reflect any lack of continuity in the underlying series. Thus 
theoretically it should often prove more fundamental to eliminate this imposed artificiality. An 
unemployment index does not cease to exist between readings, nor does Yule’s pendulum * cease to 
swing. This new conception of continuous random or stochastic processes is still unfamiliar to 
many people: I owe my personal realization of its importance to Mr. J. R. Moyal, and to his 
own fundamental contributions to its development (13). The general theory mainly originated 
with the “Russian school” of Khintchine, Kolmogoroff, Slutsky and others, but has been 
increasingly studied and applied elsewhere, especially in America and in this country during 
the war. 

To see its relevance to the schemes considered in section 2, suppose first we generalize the 
autocorrelation for the Markoff process to the autocorrelation function 

$$\rho(s) = e^{-\mu s}, \quad (s \geq 0) \tag{16}$$

where the time-lag $s$ is no longer necessarily an integral multiple of a unit interval of time. This 
autocorrelation function still preserves the unique property that the partial correlations of $x_{t+s}$ 
with $x_{t-\tau}$ ($\tau > 0$) for given $x_t$ are zero. 

Mathematically, the properties of continuous processes can be studied through their auto- 
correlation functions. We must, of course, be careful to confine oui attention to permissible 
autocorrelation functions, and here Khintchine's theorem (10) on the spectrum " or harmonic 
analysis of the autocorrelation function is relevant. This states that a necessary and suflicient 
condition for p(v) to be the autocorrelation function of a continuous stationary stochastic process 
is that 

p(.v) I cos o)V r/F(<o) (17) 

where clF(o}) represents a “distribution function." For example, if p(.v) - c (v 0), and (since 
p( 5 ) is a symmetric function in 5), p(.v) (.v ' ()), we can obtain r/F(fi)) - f(oi)ilc) by the inverse 

relation 

f(oi) * I e cos w.s r/s - . . 

• 7T / nor r 

. ' oc 

* Yule (18); see section 5 of this paper. 



32 


[No. 1, 


Bartlett — On the Theoretical Specification and 

Since $f(\omega)$ is a valid distribution function in this case, so is $\rho(s)$ a valid autocorrelation function.
It is also known (e.g., Rice, 14, Part II, where the important mathematical work by Wiener, 20, on
this aspect of the theory is referred to) that the function $f(\omega)$ gives the intensities for different
"frequencies" $\omega/2\pi$ corresponding to a harmonic analysis of the original time-series $x_t$, so that there
is a unique relation between the harmonic and correlation analyses of a time-series (for the
corresponding relation for discrete series, see Wold loc. cit.). The above spectrum for the Markoff
process gives a continuous band of frequencies $\omega/2\pi$, thus stressing the possible irrelevance of a
standard periodogram analysis for such processes. It is only when the integrated function $F(\omega)$
is a step-function that discrete frequencies and corresponding periods in the classical sense
exist.
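The relation between the Markoff autocorrelation and its spectrum can be checked numerically. The sketch below is only an illustration: it assumes the readings $\rho(s) = e^{-\mu|s|}$ and $f(\omega) = 2\mu/\pi(\mu^2 + \omega^2)$ with the integral in (17) taken over $(0, \infty)$, and the function names, the cut-off `omega_max` and the step count are hypothetical choices, not part of the paper.

```python
import math


def markoff_rho(s, mu):
    """Autocorrelation (16), extended symmetrically: rho(s) = exp(-mu*|s|)."""
    return math.exp(-mu * abs(s))


def markoff_spectrum(omega, mu):
    """Assumed spectral density f(omega) = 2*mu / (pi*(mu^2 + omega^2)), omega >= 0."""
    return 2.0 * mu / (math.pi * (mu * mu + omega * omega))


def rho_from_spectrum(s, mu, omega_max=2000.0, steps=400_000):
    """Trapezoidal approximation to (17) on [0, omega_max].

    Truncating the infinite range at omega_max is an approximation; the
    neglected tail is of order mu / omega_max at s = 0 and smaller otherwise.
    """
    d_omega = omega_max / steps
    total = 0.5 * (
        markoff_spectrum(0.0, mu)
        + math.cos(omega_max * s) * markoff_spectrum(omega_max, mu)
    )
    for k in range(1, steps):
        omega = k * d_omega
        total += math.cos(omega * s) * markoff_spectrum(omega, mu)
    return total * d_omega


if __name__ == "__main__":
    mu = 0.5  # illustrative damping constant
    for s in (0.0, 0.7, 2.0):
        print(s, markoff_rho(s, mu), rho_from_spectrum(s, mu))
```

The two printed columns should agree to about three decimal places; the agreement at $s = 0$ also confirms that the assumed density integrates to unity over positive frequencies.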

While the above theory enables us to study various permissible autocorrelation functions, it
still appears to me important to set up if possible a more detailed theoretical mechanism to
represent a time-series. If we can do this, we not only ensure automatically that the
autocorrelation function is valid, but we find what it is, and perhaps obtain also further knowledge
about the distributional properties of the process. For we have seen in the case of linear processes
that the autocorrelation function does not exhaust the distributional properties unless the process
is "normal." General consistency conditions for the higher product-moments are not, as far
as I am aware, known. And further our postulated autocorrelation function, while at first sight
reasonable, may turn out to be incorrect for the particular process we have in mind. For
example, it is common to postulate the next simplest function to (16) for a continuous process as

$\rho(s) = e^{-\mu s} \cos \lambda s \quad (s \geq 0)$ (18)

in the case when oscillations are present, but it is shown in section 5 that this function, while a
permissible autocorrelation function, is not correct without modification for the particular
second-order process set up.

Coming back, then, to a justification of (16), we could follow the typical argument used in
physical applications like the problem of Brownian motion, and assume that (1) still holds with
both interval and random increment made infinitesimal. This, however, is too specialized an
assumption for the present purpose, for while it leads to (16), it also leads to a normal distribution
for $x_t$ (cf., for example, 3). It may be generalized by supposing that while increments may still
be finite, their occurrence is random in time; this assumption (which is related to the homogeneous
random process discussed by Cramér, 4, Ch. 7) still leads to an autocorrelation function of the
same form without restricting the nature of the distribution of $x_t$. This will be more conveniently
demonstrated in detail for the second-order process. But first of all I want to record the standard
error formulae corresponding to those given in section 2.


4. Standard error formulae for the autocorrelations of continuous time-series

If we assume that we have a continuous record of $x_t$ we may evaluate the results corresponding
to those in (2) simply by replacing sums by integrals. We have, for example, 

var (var) e( V'}. 

(a) The Markoff process. By such methods we obtain for the Markoff process when $T$ is
large,

$\operatorname{var}(\mathrm{var}) \sim 2/\mu T$
$\operatorname{var}(\mathrm{cov}_s) \sim [1 + \rho_s^2(2\mu s + 1)]/\mu T$ (19)
$\operatorname{cov}(\mathrm{var}, \mathrm{cov}_s) \sim 2\rho_s(\mu s + 1)/\mu T$

where $s$ is no longer necessarily an integer. Hence

$\operatorname{var}(r_s) \sim [1 - \rho_s^2(2\mu s + 1)]/\mu T$ (20)

or when $\rho_s$ is small,

$\operatorname{var}(r_s) \sim 1/\mu T.$ (21)

It may be asked "What is the relevance of such results when in practice we probably have to
take a finite set of observations?" The point is that they do indicate the intrinsic sampling
accuracy of a series of length $T$ in contrast with that in the arbitrary number of observations we
happen to have made. We can always generate a discrete series as a set of observations made
at regular intervals on the continuous series (in general the converse is not true, e.g., if $\rho < 0$ in
(1); cf. Wold loc. cit.). For the Markoff process the random increment in (1) then becomes the sum of
increments in the time-interval $(s, s + 1)$. We now have the relation $\rho = e^{-\mu}$, whence
$1/\mu T \sim -1/(n \log \rho)$. Thus from formulae (4) and (21), as $\rho_s \to 0$, we obtain a relative efficiency in
estimating $\rho_s$ from the discrete set of $n$ observations, of

$E_0 = \frac{1 - \rho^2}{(1 + \rho^2)\log(1/\rho)}.$ (22)

The corresponding ratio from (3) and (20) depends on $\rho_s$, but in the case $s = 1$, we have

$E_1 = \frac{1 - \rho^2[1 + \log(1/\rho^2)]}{(1 - \rho^2)\log(1/\rho)}.$ (23)

The values of $E_0$ and $E_1$ are plotted against $\rho^2$ in Fig. 1.



Fig. 1. Efficiency ratios $E_0$ ($\rho_s \to 0$) and $E_1$ ($s = 1$) for estimating $\rho_s$ from observations made at regular
intervals, plotted against $\rho^2$, where $\rho$ is the true correlation $\rho_1$ and the process is a continuous Markoff
one.
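The curves of Fig. 1 can be regenerated from the two efficiency ratios. The following is a minimal sketch assuming the readings $E_0 = (1 - \rho^2)/[(1 + \rho^2)\log(1/\rho)]$ and $E_1 = \{1 - \rho^2[1 + \log(1/\rho^2)]\}/[(1 - \rho^2)\log(1/\rho)]$; the function names `e0` and `e1` are illustrative only.

```python
import math


def e0(rho):
    """Assumed efficiency ratio (22), valid in the limit rho_s -> 0."""
    return (1.0 - rho**2) / ((1.0 + rho**2) * math.log(1.0 / rho))


def e1(rho):
    """Assumed efficiency ratio (23), for lag s = 1."""
    numerator = 1.0 - rho**2 * (1.0 + math.log(1.0 / rho**2))
    return numerator / ((1.0 - rho**2) * math.log(1.0 / rho))


if __name__ == "__main__":
    # Tabulate against rho^2, the abscissa used in Fig. 1.
    for rho_sq in (0.2, 0.4, 0.6, 0.8):
        rho = math.sqrt(rho_sq)
        print(f"rho^2 = {rho_sq:.1f}   E0 = {e0(rho):.3f}   E1 = {e1(rho):.3f}")
```

Both ratios tend to unity as $\rho \to 1$, that is, as the observational interval becomes short compared with the correlation time, and fall away as the interval lengthens.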


(b) The general process. For the general process it will be sufficient to note the integral
corresponding to (5); this is

$\operatorname{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) \sim \frac{1}{T}\int_{-\infty}^{\infty} (\rho_v\rho_{v+t} + \rho_{v-s}\rho_{v+s+t} + \ldots)\,dv$ (24)

from which other formulae may be deduced. For a continuous linear process analogous to the 
discrete linear process (9) I shall write formally (cf the next section) 

where represents the total random increment up to time t corresponding to in (9). 
Then analogously to (10) we have 

$K(\tau_1, \tau_2, \tau_3, \tau_4) = \int_{-\infty}^{\infty} K_I\{\tau_1 g(u) + \tau_2 g(u - s) + \tau_3 g(u - v) + \tau_4 g(u - v - s - t)\}\,du$ (25)


where $K_I(\tau)$ is the rate of increase of the cumulant function of the total random increment.
From (25) we have formulae analogous to (11) and (12), so that cov $(r_s, r_{s+t})$ is again independent,
to our order of approximation, of the higher cumulants of $I$. Incidentally we note from (25) that $x_t$ cannot be normal
unless the variable $I$ represented in $K_I(\tau)$ is normal, i.e., the increments are intrinsically normal
or else (roughly speaking) the individual increments are sufficiently small and numerous for
their sum, in a small interval of time for which $g(t)$ is constant, to have become normal.

The results for the mean and variance of $x_t$ contained in (25) are known as Campbell's
theorem; the generalization to other seminvariants has been given by Rice (14, sections 1.5 and
3.11), equation (25) above representing a further extension required for the theoretical
developments of this paper.

Omitting the final term, we may from the theory of Fourier transforms write (24) in the
alternative form

$\operatorname{cov}(\mathrm{cov}_s, \mathrm{cov}_{s+t}) \sim \frac{\pi}{T}\int_0^\infty f^2(\omega)\{\cos \omega t + \cos \omega(2s + t)\}\,d\omega$ (24a)


where $f(\omega)$ was defined in section 3. Specific formulae for the second-order linear process are
recorded later.

5. Detailed specification of the second-order process 

Coming now to a detailed examination of the continuous second-order process, I shall
re-consider the problem which Yule used in his pioneering paper (18) as a basis for the
second-order difference equation of the type (14). He imagined a swinging pendulum subject to
bombardment by boys equipped with peashooters. This problem in another guise is of practical
importance, for equally we may think of a sensitive instrument disturbed by impulses of a
Brownian motion character (e.g., a galvanometer with suspended mirror whose torsional
oscillations are disturbed by impacts from gas molecules).* As in the case of the Markoff process, this
form of Brownian motion has usually been studied with the assumption that even in a small
interval of time the disturbances are infinitesimal but numerous; the distribution of $x_t$ then
becomes normal. But again I shall not impose this restriction here, but leave the distribution of
$x_t$ unspecified in general. This allows an exact representation of Yule's problem of the swinging
pendulum subject to any type of instantaneous random impulse; and of similar or more
complicated processes. Yule's problem I shall, without stopping here to rigorize the argument
completely, denote by the equation

$\ddot{x}_t + \alpha\dot{x}_t + \beta x_t = I_t$ (26)

where dots denote differentiation with respect to the time $t$. In this equation $I_t$ represents a
random impulse function which changes $\dot{x}_t$ discontinuously, but $\ddot{x}_t$ may be regarded as formally
defined by (26) in terms of $I_t$, which is an improper function possessing a proper integral.†
The solution of (26) involving $I_t$ is







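$x_t = \int_0^t \frac{e^{\mu_1(t-u)} - e^{\mu_2(t-u)}}{\mu_1 - \mu_2}\, I_u\,du$ (27)


where $\mu_1$ and $\mu_2$ are the roots of $x^2 + \alpha x + \beta = 0$. Hence, if $E\{I_t\} = 0$, $E\{I_u I_v\} = 0$ when
$u \neq v$ and $\sigma^2(I)$ is a finite quantity representing the rate of increase of variance of the integrated
impulse $\int I_t\,dt$, we obtain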

}•<•>»' ™ 

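whence $\sigma^2(x) = \operatorname{var}(x) = \sigma^2(I)/2\alpha\beta$ and

$\rho_s = e^{-\frac{1}{2}\alpha s}\,\frac{\cos(\lambda s - \theta)}{\cos\theta}$ (29)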


* My attention has kindly been drawn by both Mr. J. E. Moyal and Mr. P. A. Moran to references 
(6 and 1) on this application. 

† This comment suffices to define $\dot{x}_t$ and $\ddot{x}_t$ in terms of (26) and its solution. In other problems
$\dot{x}_t$ may be a random quantity whose existence is established only by a wider definition of differentiation (due
to Slutsky, 16; see also Moyal, 13). On Slutsky's definition, $x_t$ is continuous but still not strictly
differentiable.





where $\lambda^2 = \beta - \frac{1}{4}\alpha^2$, $\tan\theta = \frac{1}{2}\alpha/\lambda$. Equation (28) shows that if the series has been generated
a long time ago, so that the effect of initial conditions has become negligible, the series stabilizes
at $\sigma^2(x)$ given by $\sigma^2(I)/2\alpha\beta$ (cf. Kendall, 8, equation (12), for the corresponding result for a discrete
process). The formula (29) for $\rho_s$ should be contrasted with (18), and also with Kendall's result
for a discrete process (8, equation (13)). It is, I find, not new, having been given in the case of
Brownian oscillations by Zernike (22, equation (6), p. 518).*

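The corresponding frequency spectrum is obtained by inverting the function $\rho_s$ as

$f(\omega) = \frac{2}{\pi}\,\frac{\alpha\beta}{(\omega^2 - \beta)^2 + \alpha^2\omega^2}$ (30)

(cf. 6, equation 2209). It may equivalently be written

$f(\omega) = \frac{1}{\pi}\left\{\frac{\frac{1}{2}\alpha}{(\omega - \lambda)^2 + \frac{1}{4}\alpha^2} + \frac{\frac{1}{2}\alpha}{(\omega + \lambda)^2 + \frac{1}{4}\alpha^2} + \frac{\alpha(\beta - \omega^2)}{[(\omega - \lambda)^2 + \frac{1}{4}\alpha^2][(\omega + \lambda)^2 + \frac{1}{4}\alpha^2]}\right\}$

where the first two terms correspond to the spectrum of the function (18). For small damping
($\alpha$ small), and $f(\omega)$ considered for positive $\omega$, only the first term of all is large as $\omega \to \lambda$, and the
difference between the spectra of (29) and (18) becomes small.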

The function $\rho_s$ is a solution of the differential equation

$\rho_s'' + \alpha\rho_s' + \beta\rho_s = 0$ (31)

which can be obtained directly from (26) by multiplying the equation at time $t + s$ by $x_t$, and
averaging, $I_{t+s}$ being uncorrelated with $x_t$.
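The properties stated for the second-order autocorrelation and its spectrum lend themselves to a quick numerical check. The sketch below assumes the readings $\rho_s = e^{-\alpha s/2}\cos(\lambda s - \theta)/\cos\theta$ with $\lambda^2 = \beta - \frac{1}{4}\alpha^2$ and $\tan\theta = \frac{1}{2}\alpha/\lambda$, the closed form $f(\omega) = 2\alpha\beta/\pi[(\omega^2 - \beta)^2 + \alpha^2\omega^2]$, and a three-term decomposition of it; all function names and the finite-difference step are hypothetical.

```python
import math


def rho_second_order(s, alpha, beta):
    """Assumed autocorrelation (29) for s >= 0; requires beta > alpha^2 / 4."""
    lam = math.sqrt(beta - 0.25 * alpha**2)
    theta = math.atan2(0.5 * alpha, lam)
    return math.exp(-0.5 * alpha * s) * math.cos(lam * s - theta) / math.cos(theta)


def spectrum_closed(omega, alpha, beta):
    """Assumed closed form (30) of the spectral density on omega >= 0."""
    return (2.0 / math.pi) * alpha * beta / (
        (omega**2 - beta) ** 2 + alpha**2 * omega**2
    )


def spectrum_three_terms(omega, alpha, beta):
    """Equivalent three-term form; the first two terms are the spectrum of
    exp(-alpha*s/2) * cos(lambda*s), i.e. of a function of type (18)."""
    lam = math.sqrt(beta - 0.25 * alpha**2)
    q = 0.25 * alpha**2
    d_minus = (omega - lam) ** 2 + q
    d_plus = (omega + lam) ** 2 + q
    return (1.0 / math.pi) * (
        0.5 * alpha / d_minus
        + 0.5 * alpha / d_plus
        + alpha * (beta - omega**2) / (d_minus * d_plus)
    )


def ode_residual(s, alpha, beta, step=1e-4):
    """Central-difference estimate of rho'' + alpha*rho' + beta*rho at s > step."""
    r_minus = rho_second_order(s - step, alpha, beta)
    r_zero = rho_second_order(s, alpha, beta)
    r_plus = rho_second_order(s + step, alpha, beta)
    first = (r_plus - r_minus) / (2.0 * step)
    second = (r_plus - 2.0 * r_zero + r_minus) / step**2
    return second + alpha * first + beta * r_zero


if __name__ == "__main__":
    alpha, beta = 0.4, 1.0  # illustrative constants
    print(spectrum_closed(0.9, alpha, beta), spectrum_three_terms(0.9, alpha, beta))
    print(ode_residual(1.0, alpha, beta))
```

The two spectrum values should coincide to rounding error, and the residual of the differential equation should be negligible compared with the individual terms.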

From (29) are obtained explicit formulae for any of the sampling error expressions given in
section 4. One of the most useful is

$\operatorname{var}(r_s) \sim (\alpha^2 + \beta)/\alpha\beta T$ (32)

when $\rho_s$ is small. The following expression for cov $(\mathrm{cov}_s, \mathrm{cov}_{s+t})$, apart from the term involving
$\gamma(I)$, is also given for reference. It was obtained from (24) and checked from (24a).


where 


T0{t)e^^ 


cov (cov„ cov,+,) -- 0(r) h 0(2^^ I' 0 

^ (2P 

8X»p 


cos X/ -f - ^ 4X“~" 


In the "aperiodic" case $\lambda = 0$, we obtain the comparatively simple result from (24a),


cov (cov,, cov, f ,) 


4 

3 



Tvir 


(34) 


From the exact solution (27) for $x_t$ we may investigate whether any exact difference equation for
$x_t$ exists in place of the differential equation (26). We obtain

$x_{t+2h} + a x_{t+h} + b x_t = [I]_t^{t+2h}$ (35)

where $a = -(e^{\mu_1 h} + e^{\mu_2 h})$, $b = e^{(\mu_1 + \mu_2)h}$, and the symbol on the right-hand side denotes a definite
integral from $t$ to $t + 2h$ involving $I_u$, viz.,

$[I]_t^{t+2h} = \int_{t+h}^{t+2h} \frac{e^{\mu_1(t+2h-u)} - e^{\mu_2(t+2h-u)}}{\mu_1 - \mu_2}\, I_u\,du - b\int_t^{t+h} \frac{e^{\mu_1(t-u)} - e^{\mu_2(t-u)}}{\mu_1 - \mu_2}\, I_u\,du.$

Thus the Yule-Kendall relation

$x_{t+2h} + a x_{t+h} + b x_t = \epsilon_{t+2h}$ (36)

where $\epsilon_{t+2h}$ is a random increment arising subsequent to $x_{t+h}$, is not exact for the process
originally considered by Yule. There must, of course, be some discrete process corresponding
to (26), yielding the correct $\rho_s$ solution (29), but this is defined by (35). Multiplying this equation
by $x_{t-s}$ ($s \geq 0$), and averaging, we obtain

$\rho_{s+2h} + a\rho_{s+h} + b\rho_s = 0$ (37)

an equation which is satisfied by the solution (29). Although (37) is identical with the difference 

* Zernike goes on to consider the efficiency of estimation of the mean, a simpler problem than those 
considered here, but the relevant one in his case, and one, as noted in section 2, recently discussed by 
Yule (19). 




equation for $\rho_s$ given by Kendall, its solution is different because it does not hold for $s = -h$
but only for $s \geq 0$, owing to the dependence of $[I]_t^{t+2h}$ on $x_{t+h}$. In fact

$\rho_h + a + b[\rho_{-h}] = 0$ (38)

where $[\rho_{-h}]$ denotes the analytic continuation of $\rho_h$ for negative $h$ and is unequal to the value
$\rho_{-h} = \rho_h$ except in the limiting case when $\alpha = 0$. Thus in general an autoregression analysis of
the type (36) will yield values of $a$ and $b$ inappropriate for the process (35); this is so even in the
limit when $h$ becomes infinitesimal. If $\lambda/2\pi$ is the "frequency," I find the limiting value of $\lambda$
estimated from (36) as $\sqrt{\beta - \frac{1}{9}\alpha^2}$ instead of $\sqrt{\beta - \frac{1}{4}\alpha^2}$. The difference will be negligible as $\alpha$
becomes small, but if $\alpha^2 \to 4\beta$ gives rise to a spurious frequency when the true frequency is zero.
Correspondingly the "period" or "wavelength" $2\pi/\lambda$, which in general, at least for small $h$,
will be underestimated,* will remain finite even when the true period is infinite.



Fig. 2. Values of $\lambda$ estimated from the Yule-Kendall finite difference equation plotted against the
observational interval, when the process is a continuous second-order one with true frequency
$\lambda/2\pi = 0$ (and $\alpha = 2$).


To check this, I considered the solution of (26) in this aperiodic case, for which $\beta = \frac{1}{4}\alpha^2$.
The solution is

$x_t = \int_0^t (t - u)\,e^{-\frac{1}{2}\alpha(t-u)}\, I_u\,du,$

$\rho_s = (1 + \tfrac{1}{2}\alpha s)\,e^{-\frac{1}{2}\alpha s} \quad (s > 0)$ (39)

If for such an autocorrelation, we attempted to estimate the frequency by means of (36), we
should obtain the spurious frequency shown in Fig. 2, where $\lambda$ is plotted as a function of the
value of $h$, where $h$ is the interval between observations. For definiteness $\alpha$ was taken to be 2, so
that the limiting value of $\lambda$ is $\frac{1}{3}\sqrt{5} = 0.745$.
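The spurious frequency can be reproduced by fitting the Yule-Kendall relation (36) to the exact aperiodic autocorrelations at lags $h$ and $2h$. The sketch below is an illustration under stated assumptions: it takes $\rho_s = (1 + \frac{1}{2}\alpha s)e^{-\alpha s/2}$, obtains $a$ and $b$ from the two equations got by multiplying (36) by $x_{t+h}$ and by $x_t$ and averaging, and reads the frequency from the roots of $z^2 + az + b = 0$; the function names are hypothetical.

```python
import math


def rho_aperiodic(s, alpha):
    """Assumed autocorrelation (39) for the aperiodic case beta = alpha^2 / 4."""
    return (1.0 + 0.5 * alpha * s) * math.exp(-0.5 * alpha * s)


def yule_kendall_lambda(alpha, h):
    """Frequency constant implied by fitting (36) to rho_h and rho_2h.

    From rho_h + a + b*rho_h = 0 and rho_2h + a*rho_h + b = 0 we get a and b;
    the roots of z^2 + a*z + b = 0 are sqrt(b) * exp(+/- i*lambda*h).
    """
    r1 = rho_aperiodic(h, alpha)
    r2 = rho_aperiodic(2.0 * h, alpha)
    b = (r1 * r1 - r2) / (1.0 - r1 * r1)
    a = -r1 * (1.0 + b)
    cosine = -a / (2.0 * math.sqrt(b))
    if cosine >= 1.0:
        return 0.0  # real roots: no oscillation implied
    return math.acos(cosine) / h


if __name__ == "__main__":
    alpha = 2.0  # the value used for Fig. 2
    for h in (0.01, 0.25, 0.5, 1.0):
        print(h, yule_kendall_lambda(alpha, h))
    print("limit quoted in the text:", math.sqrt(5.0) / 3.0)
```

For small $h$ the fitted value approaches $\frac{1}{3}\sqrt{5}$ although the true frequency is zero, which is the behaviour the text attributes to the autoregressive fit.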

The invalidity of (36) raises the question whether any other finite difference equations (apart 

* This bias appears related to that obtained by Spencer Smith (17), who considered continuous periodic 
time-series subject to independent disturbances in amplitude, phase and trend. I am doubtful, however, 
of the possibility of analysing a disturbed oscillatory series into components corresponding to independent
disturbances of this kind, when for natural disturbances of the type considered here the effects are 
necessarily related. From his concluding remarks Spencer Smith appears to recognize these limitations of 
his method. 






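from the relation (35)) exist for the process (27). We have seen that for this process $\dot{x}_t$ exists as
well as $x_t$; in fact, from (27) we have

$\dot{x}_t = \int_0^t \frac{\mu_1 e^{\mu_1(t-u)} - \mu_2 e^{\mu_2(t-u)}}{\mu_1 - \mu_2}\, I_u\,du$ (40)

whence

$\rho(\dot{x}_t, \dot{x}_{t+s}) = -\rho_s''/\beta,$
$\rho(\dot{x}_t, x_{t+s}) = -\rho(x_t, \dot{x}_{t+s}) = -\rho_s'/\sqrt{\beta}.$ (41)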


We obtain the simultaneous pair of first-order difference equations in $x_t$ and $\dot{x}_t$,


. _ £{£,+^1 • _ „ 
Xu A -r. -r, 

E{x,.aX,}^ E{x,,aX,}, 

AT , I * r — 


[c.V^ 

[c J.' ' 


(42) 


where 


£{i>} £{;r‘} 

£ii<7,-i,'*i - p.w{i (;»+>■■)’}■ 

atovi"*!-. ■(«){! -^(S)’- p4' 

£■{[0.^2]."*} - 


As $h \to 0$, these two equations reduce respectively to (26) and to $\dot{x}_t = (x_{t+h} - x_t)/h$, showing
that for $h$ small enough it is sufficient to consider the formal equation (26).


6. The estimation problem; theoretical information available on the unknown parameters

Formally the estimation problem for the second-order process is simple, since it is linear in
the unknowns $\alpha$ and $\beta$. We obtain the least-squares estimates $\alpha_e$, $\beta_e$, where

$\int_0^T (x_t\ddot{x}_t + \alpha_e x_t\dot{x}_t + \beta_e x_t^2)\,dt = \int_0^T (\dot{x}_t\ddot{x}_t + \alpha_e\dot{x}_t^2 + \beta_e x_t\dot{x}_t)\,dt = 0$

or since $\int_0^T x_t\dot{x}_t\,dt/T \to 0$ as $T$ increases, we have

$\alpha_e = -\int_0^T \dot{x}_t\ddot{x}_t\,dt \Big/ \int_0^T \dot{x}_t^2\,dt, \qquad \beta_e = -\int_0^T x_t\ddot{x}_t\,dt \Big/ \int_0^T x_t^2\,dt$ (43)

We know further that these least-squares estimates, which would have minimum standard errors
in orthodox regression analysis, will have asymptotically minimum errors in the present case,
irrespective of the distribution of $I_t$ (cf. 12 and section 2 of the present paper), given by

$\operatorname{var}(\alpha_e) \sim \sigma^2(I)\Big/\int_0^T \dot{x}_t^2\,dt \sim 2\alpha/T$
$\operatorname{var}(\beta_e) \sim \sigma^2(I)\Big/\int_0^T x_t^2\,dt \sim 2\alpha\beta/T$ (44)

In some problems, where a continuous track of the time-series is available (in the case of torsional
Brownian oscillations continuous records from an oscillating mirror system are reproduced in 6,
Fig. 79) it is possible that direct optical or electrical devices could be invented to measure the
quantities occurring in (43), where it should be noted that owing to the existence of $\ddot{x}_t$ being only
formal, $\int_0^T \dot{x}_t\ddot{x}_t\,dt$ is to be interpreted as $\lim_{h \to 0} \frac{1}{h}\int_0^T \dot{x}_t(\dot{x}_{t+h} - \dot{x}_t)\,dt$. In other cases, the formulae (44)
are a gauge by which the efficiency of any actual method of estimation used can be investigated.

The principles involved are best illustrated first for the simpler Markoff process, since we
have seen that the use of discrete series for the second-order process raises special difficulties.






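For the Markoff process we have corresponding to the equation (26) for the second-order process,
the formal equation

$\dot{x}_t + \mu x_t = I_t$ (45)

whence

$x_t = \int_0^t e^{-\mu(t-u)}\, I_u\,du$
$\sigma^2(x) = \sigma^2(I)/2\mu$ (46)
$\rho_s = e^{-\mu s} \quad (s > 0)$

The least-squares estimate of $\mu$ is

$\mu_e = -\int_0^T x_t\dot{x}_t\,dt \Big/ \int_0^T x_t^2\,dt$ (47)

where

$\operatorname{var}(\mu_e) \sim \sigma^2(I)\Big/\int_0^T x_t^2\,dt \sim 2\mu/T$ (48)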


In (47) $\dot{x}_t$ is to be replaced by $(x_{t+h} - x_t)/h$ but if we do not proceed to the limit $h \to 0$, we are
obviously estimating $\mu$ by means of the autocorrelation $\rho_h$. Thus for finite $h$, bias may be
introduced if we keep to formula (47), and it will be more consistent to use the formula for $\rho_h$ directly,
i.e.,

$h\mu_e = -\log r_h$ (49)

Fig. 3. Efficiency ratios: (i) $E_2$ and (ii) $E_1E_2$, plotted against $\rho^2$ ($\rho = \rho_h$), for estimating the unknown
parameter in a continuous Markoff process from the autocorrelation obtained (i) from a continuous
record, (ii) from observations made at intervals $h$.


For this estimate, if $r_h$ were still obtained from a continuous record as $\int_0^T x_t x_{t+h}\,dt \big/ \int_0^T x_t^2\,dt$, we
have var $(\mu_e) \sim$ var $(r_h)/(h^2\rho_h^2)$ or from (20), for any distribution of $I_t$,

$\operatorname{var}(\mu_e) \sim [1 - \rho_h^2(2h\mu + 1)]/(h^2\mu T\rho_h^2)$ (50)

The ratio of (48) to (50) can be written as

$E_2 = \frac{\frac{1}{2}\log^2(1/\rho_h^2)}{\log\rho_h^2 - 1 + 1/\rho_h^2}$ (51)

which is plotted in Fig. 3. If further $r_h$ were obtained from a discrete set of observations with
interval $h$, we should have var $(r_h)$ given by (3) instead of by (21), and the overall efficiency is
reduced to $E_1E_2$, where $E_1$ was shown in Fig. 1. $E_1E_2$ is also plotted in Fig. 3; it will be seen
that the fall in efficiency as $h$ increases is fairly rapid.
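The two curves of Fig. 3 follow directly from the efficiency ratios. The sketch below assumes the readings $E_2 = \frac{1}{2}\log^2(1/\rho_h^2)/(\log\rho_h^2 - 1 + 1/\rho_h^2)$ and $E_1 = \{1 - \rho^2[1 + \log(1/\rho^2)]\}/[(1 - \rho^2)\log(1/\rho)]$ with $\rho = \rho_h$; the function names are illustrative and not taken from the paper.

```python
import math


def e1(rho):
    """Assumed ratio (23): discrete against continuous record, lag one interval."""
    numerator = 1.0 - rho**2 * (1.0 + math.log(1.0 / rho**2))
    return numerator / ((1.0 - rho**2) * math.log(1.0 / rho))


def e2(rho):
    """Assumed ratio (51): estimate via log r_h against the regression estimate."""
    u = math.log(1.0 / rho**2)
    return 0.5 * u * u / (1.0 / rho**2 - 1.0 - u)


def overall_efficiency(rho):
    """Product E1*E2 for observations made at intervals h, with rho = rho_h."""
    return e1(rho) * e2(rho)


if __name__ == "__main__":
    for rho_sq in (0.2, 0.4, 0.6, 0.8):
        rho = math.sqrt(rho_sq)
        print(f"rho^2 = {rho_sq:.1f}   E2 = {e2(rho):.3f}   E1*E2 = {overall_efficiency(rho):.3f}")
```

Since $\rho_h = e^{-\mu h}$, small values of $\rho^2$ correspond to long observational intervals, and the tabulated product shows how quickly the efficiency is lost as $h$ grows.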






Thus in fitting the Markoff process (45) with finite h we revert from the regression estimate 
to a consistent estimate obtained from the first available autocorrelation in the correlogram. 
The remainder of the correlogram would be used in conjunction with the known magnitude of 
its sampling errors to consider the adequacy of the fit. 

Let us try to consider now the appropriate procedure for the second-order process. We
first of all make the inevitable substitution $(\dot{x}_{t+h} - \dot{x}_t)/h$ for $\ddot{x}_t$. If the nature of the observations
prohibits the direct use of $\dot{x}_t$ we further substitute $(x_{t+k} - x_t)/k$ for $\dot{x}_t$, where for the moment
$k$ ($\leq h$) is not assumed equal to $h$. The limiting estimate of $\beta$ presents no difficulty, the regression
estimate becoming

$\beta_e = 2(1 - r_k)/k^2$ (52)

this being valid for small $k$ as can be seen from the expansion

$\rho_k = 1 - \tfrac{1}{2}\beta k^2 + \ldots \quad (k > 0).$

But for $\alpha$ the regression estimate becomes

$\alpha_e = \frac{r_{h+k} + r_{h-k} - 2(r_h + r_k - 1)}{2h(1 - r_k)}$ (53)

whereas for both $h$ and $k$ small the corresponding expression in $\rho$ is equal to $\alpha(1 - \frac{1}{3}k/h)$. Thus
(53) unless corrected would lead to an under-estimate even when $h \to 0$ unless $k/h$ is small, the
$\frac{2}{3}$rds factor multiplying $\alpha$ being reminiscent of the limiting bias in $\lambda$ arising from the use of the
finite difference equation (36). In practice if $r_s$ is available only for integral values of $s$, we should
use the functional relations of $\rho_s$ to $\alpha$ and $\beta$ directly to give consistent estimates of $\alpha$ and $\beta$,
analogously to the procedure for the Markoff process, but the above point is important because
it throws some doubt on the efficiency of taking the first two autocorrelations $r_1$ and $r_2$. All we
know from (53) is that this limiting formula is not valid unless $k/h$ is small. The direct use of
the functional relations for $\rho_s$ must therefore for $h$ small correspond to a correction for bias which
in the case of $k = h$ is a multiplying factor of $1\frac{1}{2}$. Unless the variance of the expression in (53)
also alters as $k$ increases to $h$ the efficiency of the estimate would not be greater than 4/9.
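The limiting bias of the regression estimate of $\alpha$ can be checked by substituting the true autocorrelations into the estimate. The following sketch assumes the reading $\alpha_e = [r_{h+k} + r_{h-k} - 2(r_h + r_k - 1)]/[2h(1 - r_k)]$ and the second-order autocorrelation $\rho_s = e^{-\alpha s/2}[\cos\lambda s + (\alpha/2\lambda)\sin\lambda s]$; the names and the chosen constants are hypothetical.

```python
import math


def rho_second_order(s, alpha, beta):
    """Assumed second-order autocorrelation, symmetric in s; beta > alpha^2 / 4."""
    s = abs(s)
    lam = math.sqrt(beta - 0.25 * alpha**2)
    return math.exp(-0.5 * alpha * s) * (
        math.cos(lam * s) + (0.5 * alpha / lam) * math.sin(lam * s)
    )


def alpha_regression_value(alpha, beta, h, k):
    """Value of the assumed estimate (53) when each r is replaced by its
    expectation rho; for small h and k this should be near alpha*(1 - k/(3h))."""
    def r(lag):
        return rho_second_order(lag, alpha, beta)

    numerator = r(h + k) + r(h - k) - 2.0 * (r(h) + r(k) - 1.0)
    return numerator / (2.0 * h * (1.0 - r(k)))


if __name__ == "__main__":
    alpha, beta, h = 0.5, 1.0, 1e-3  # illustrative constants
    for ratio in (0.1, 0.5, 1.0):
        k = ratio * h
        print(ratio, alpha_regression_value(alpha, beta, h, k), alpha * (1.0 - ratio / 3.0))
```

With $k = h$ the computed value is close to $\frac{2}{3}\alpha$, so the correcting factor of $1\frac{1}{2}$ mentioned in the text is recovered.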

A complete answer to this problem must await a detailed investigation of the asymptotic
efficiency of the functional use of $r_1$ and $r_2$, using their variance and covariance relations
(obtainable from (33)). So far I have carried this out only in the special but rather important case of
small damping ($\alpha$ small). The formula for cov $(r_s, r_{s+t})$, for which the general form in terms of
cov (cov, cov), etc., was given in section 2, becomes in this case

cov (/;, r, , ,) .v(.v f- t) cos >7 ^ t sin >7 -f {Is | f) sin X (2.9 I t) , sin X.9 sin Xt^ t) 

a~ ' 2 4X ^X** * 


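From the relations for var $(\alpha_e)$ and var $(\beta_e)$, e.g.,

$\operatorname{var}(\alpha_e) \sim \left\{\left(\frac{\partial\rho_{s+t}}{\partial\beta}\right)^2 \operatorname{var}(r_s) - 2\,\frac{\partial\rho_s}{\partial\beta}\frac{\partial\rho_{s+t}}{\partial\beta}\operatorname{cov}(r_s, r_{s+t}) + \left(\frac{\partial\rho_s}{\partial\beta}\right)^2 \operatorname{var}(r_{s+t})\right\} \Big/ \left(\frac{\partial\rho_s}{\partial\alpha}\frac{\partial\rho_{s+t}}{\partial\beta} - \frac{\partial\rho_s}{\partial\beta}\frac{\partial\rho_{s+t}}{\partial\alpha}\right)^2$

I established after somewhat tedious algebra that the formulae (44) still hold when $\alpha$ and $\beta$ are
estimated from $r_s$ and $r_{s+t}$ for any small values of $s$ and $t$; thus, at least in the case of $\alpha$ small,
the use of the first two autocorrelations $r_1$ and $r_2$ for estimating $\alpha$ and $\beta$ is justifiable.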

The importance of the form of the autocorrelation function as $s \to 0$ is worth stressing even
if superposed error is present, for while the latter would vitiate the above analysis unless allowed for,
its effect on the correlation function in this region may itself help to indicate its character. Thus
a random and entirely independent error superposed on each observation depresses the correlation
function a finite amount at $s$ near zero, this being the efficient estimate of its magnitude
approximated to in practice by considering observational differences; a superposed Markoff process,
which might also be a form of error for a continuous process, will lead to a function with a finite
slope at $s = 0$. The further terms in the expansion of $\rho_s$ depend, as we have seen, on the values
of $\beta$ and $\alpha$ (in that order) in the second-order oscillatory process. The loss of efficiency if an
error effect as well as the constants of the main process have to be estimated from values of $\rho_s$
only available for integral values of $s$ will evidently be considerable.




Application to Wolfer's sunspot numbers. In view of Yule's development of the finite
difference equation with specific regard to the analysis of the Wolfer sunspot numbers, it seems
desirable (without attempting here any complete discussion of these figures) to illustrate the present
theory on the same data. Following Yule, I have assumed first that the series of annual numbers
quoted might be represented by a second-order process. The two unknown constants are now
estimated from the first two correlations $r_1$ and $r_2$, this being fairly rapidly done by an
interpolatory method. The result is given in Table II. While I have stressed that the large standard
errors for $r_s$ when $\rho_s$ is small allow comparatively large departures from expectation, the estimated
damping factor appears excessive, being even greater than in Yule's analysis. Yule suggested
that the data were affected by random observational errors, and while this is not very apparent
from the annual averages he depicts, it is much more evident in the original quarterly figures 
(see 11, Fig. 1). For comparison I include also therefore an analysis for Yule’s smoothed figures. 
Of course the use of averages and graduated figures is highly dangerous in analysing time-series 
for periods, and it would seem more satisfactory to use the original quarterly figures with appro- 
priate inclusion in the estimation equations of the effect of any observational error. However, 
the analysis for the smoothed figures is of some interest. The estimated period still comes out a 
little lower than that usually accepted, but the discrepancy appears trivial compared with the 
estimated period’s standard error of the order of ii per cent. The asymptotic formula used, 
corresponding to 100 per cent. efficiency of estimation (which is certainly not reached), was

$\operatorname{var}(\beta_e - \tfrac{1}{4}\alpha_e^2) \sim \alpha(2\beta + \tfrac{1}{2}\alpha^2)/T$ (55)

this formula follows from the variances of the efficient estimates of $\alpha$ and $\beta$, and from their zero
correlation.

Table II

Autocorrelations of Wolfer's sunspot numbers

          Annual averages                                Graduated averages
 s    Observed   Theoretical   Theoretical         Observed   Theoretical   Theoretical
      (Yule)     (Yule)        (present theory)    (Yule)     (Yule)        (present theory)

 1     0.8112    (0.8112)      (0.8112)             0.8407    (0.8407)      (0.8407)
 2     0.4340    (0.4340)      (0.4341)             0.4714    (0.4714)      (0.4713)
 3     0.0316     0.0513        0.0855              0.0470     0.0397        0.0605
 4    -0.2645    -0.2154       -0.1290             -0.2641    -0.3181       -0.2562
 5    -0.4041    -0.3228       -0.1990             -0.4043    -0.5139       -0.4091

      $\frac{1}{2}\alpha_e = 0.3160$, $\lambda_e = 0.6158$               $\frac{1}{2}\alpha_e = 0.1593$, $\lambda_e = 0.5811$
      $\sigma(\alpha_e) \sim 0.085$, $\sigma(\beta_e) \sim 0.059$            $\sigma(\alpha_e) \sim 0.060$, $\sigma(\beta_e) \sim 0.036$
      $\sigma(r_s) \sim 0.129$ ($\rho_s = 0$)                       $\sigma(r_s) \sim 0.152$ ($\rho_s = 0$)
      Estimated period 10.2 yrs. ± 1.7                  Estimated period 10.8 yrs. ± 1.2
      Cf. estimated period (Yule) 10.6 yrs.             Cf. estimated period (Yule) 11.2 yrs.
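The fitting of the two constants from $r_1$ and $r_2$ can be imitated with a short numerical solution. The sketch below is not the interpolatory method used for the table: it substitutes a Newton iteration with a finite-difference Jacobian, assumes the autocorrelation $\rho_s = e^{-as}[\cos\lambda s + (a/\lambda)\sin\lambda s]$ with $a = \frac{1}{2}\alpha$ and $\beta = \lambda^2 + a^2$, and takes the series length as $T = 176$ years, which is an assumption made here for illustration. All function names and starting values are hypothetical.

```python
import math


def rho(s, a, lam):
    """Assumed second-order autocorrelation with a = alpha / 2."""
    return math.exp(-a * s) * (math.cos(lam * s) + (a / lam) * math.sin(lam * s))


def fit_from_r1_r2(r1, r2, a=0.2, lam=0.6, iterations=50, eps=1e-6):
    """Solve rho(1) = r1, rho(2) = r2 for (a, lam) by Newton's method,
    using central differences for the Jacobian (a substitute technique)."""
    for _ in range(iterations):
        f1 = rho(1.0, a, lam) - r1
        f2 = rho(2.0, a, lam) - r2
        j11 = (rho(1.0, a + eps, lam) - rho(1.0, a - eps, lam)) / (2.0 * eps)
        j12 = (rho(1.0, a, lam + eps) - rho(1.0, a, lam - eps)) / (2.0 * eps)
        j21 = (rho(2.0, a + eps, lam) - rho(2.0, a - eps, lam)) / (2.0 * eps)
        j22 = (rho(2.0, a, lam + eps) - rho(2.0, a, lam - eps)) / (2.0 * eps)
        det = j11 * j22 - j12 * j21
        da = (-f1 * j22 + f2 * j12) / det
        dlam = (-j11 * f2 + j21 * f1) / det
        a += da
        lam += dlam
        if abs(da) + abs(dlam) < 1e-12:
            break
    return a, lam


def standard_errors(a, lam, T):
    """Asymptotic standard errors from the assumed formulae (44) and (32)."""
    alpha = 2.0 * a
    beta = lam * lam + a * a
    return {
        "alpha": math.sqrt(2.0 * alpha / T),
        "beta": math.sqrt(2.0 * alpha * beta / T),
        "r_small_rho": math.sqrt((alpha**2 + beta) / (alpha * beta * T)),
    }


if __name__ == "__main__":
    a, lam = fit_from_r1_r2(0.8112, 0.4340)  # annual averages, Table II
    print("a =", a, "lambda =", lam, "period =", 2.0 * math.pi / lam)
    print(standard_errors(a, lam, T=176))  # T = 176 is an assumed series length
```

Under these assumptions the fitted constants and the period should fall close to the annual-average entries of Table II, which gives some reassurance about the reading of the garbled table footer.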


The present analysis does not refute Yule’s original analysis, the bias noted in section 5 being 
apparently negligible owing to the small damping for this series, and the estimated period actually 
less than Yule’s for both ungraduated and graduated data. The apparent adequate fit (for the 
graduated series) does not, of course, prove that the theoretical model is correct, but it does place 
the onus of proof on those who claim more complicated schemes or more accurate estimates to 
provide sampling errors and tests in support of their claims. 

The alternative suggestion by Yule that the sunspot series might represent the square of
the amplitude $x_t^2$ of an oscillating series rather than $x_t$ itself is one which seems to merit further
investigation, but the following notes indicate the difficulty of handling such a theory.

(i) The autocorrelation function $\rho_s(x^2)$ is no longer independent of the nature of the
distribution of $I_t$.

(ii) The autocorrelation function for various $\alpha$ may have a discontinuity at $\alpha = 0$, being
given by $\cos 2\lambda s$ when $\alpha = 0$, and by $\rho_s^2(x)$ for $\alpha \neq 0$ for $I_t$ normal.

(iii) The expected value of $r_s(x^2)$ as $T$ increases converges more and more slowly to $\rho_s(x^2)$
as $\alpha$ approaches 0.




These comments show that no simple $x_t^2$ theory, such as assuming the disturbances $I_t$ to be
normal, will fit the observed autocorrelations, which have both positive and negative values.

7. Concluding remarks 

Before I make way for other speakers, let me anticipate some obvious limitations of the above 
theory. Some reflect the rudimentary state of our knowledge, as, for example, the use of 
standard errors of serial correlation coefficients which approach unity. Here I would note that 
no simple transformation is available to convert to a more suitable scale, as for ordinary correla- 
tion coefficients, but that these particular formulae have been used only in conjunction with the 
asymptotic regression theory, and will tend to have the same approximate validity as the latter. 
The extent to which this “ large-sample ” theory can be used is clearly a matter for further 
investigation; it will depend very much on the nature of the time-series, for we have seen 
that the extent to which we can estimate expected values accurately depends on the serial relations 
between the observations. For example, with the second-order process the less frequent the 
disturbances the longer the series we have to take before our data can be representative. This 
reflects theoretical boundaries to our statistical analysis (this is not peculiar to time-series) in 
that we cannot always hope to distinguish empirically between, say, an exact harmonic series and 
one with very infrequent disturbances. The importance of specifying when possible the
theoretical form of the process has been stressed in this paper, and an attempt made to develop a
reasonably logical interplay of theoretical structure and corresponding sampling theory for some
typical time-series. Thus I hope I have tied up a few of the loose ends in this straggling subject;
I am conscious of many more still left to trip over.


8. References 

1. Barnes, R. B., and Silverman, S. "Brownian motion as a natural limit to all measuring processes," Reviews of Modern Physics, 6 (1934), 162.

2. Bartlett, M. S. "Some aspects of the time-correlation problem in regard to tests of significance," J. Roy. Stat. Soc., 98 (1935), 536.

3. Chandrasekhar, S. "Stochastic problems in physics and astronomy," Reviews of Modern Physics, 15 (1943), 1.

4. Cramér, H. Random variables and probability distributions (1937).

5. Daniell, P. J. "Sampling errors of the lag-covariance of fluctuating time-series" (unpublished note).

6. Fowler, R. H. Statistical mechanics (Cambridge, 2nd ed., 1936).

7. Kendall, M. G. "Oscillatory movements in English agriculture," J. Roy. Stat. Soc., 100 (1944), 91.

8. Kendall, M. G. "On autoregressive time-series," Biometrika, 33 (1944), 105.

9. Kendall, M. G. "On the analysis of oscillatory time-series," J. Roy. Stat. Soc., 108 (1945), 93.

10. Khintchine, A. "Korrelationstheorie der stationären stochastischen Prozesse," Math. Annalen, 109 (1934), 604.

11. Larmor, J., and Yamaga, N. "On permanent periodicity in sunspots," Proc. Roy. Soc., A93 (1917), 493.

12. Mann, H. B., and Wald, A. "On the statistical treatment of linear stochastic difference equations," Econometrica, 11 (1943), 173.

13. Moyal, J. E. "Theory of random functions" (paper not yet published).

14. Rice, S. O. "Mathematical analysis of random noise," Bell System Tech. J., 23 (1944), 282; and 24 (1945), 46.

15. Slutsky, E. "Sur les fonctions éventuelles continues, intégrables et dérivables dans le sens stochastique," Comptes Rendus, 187 (1928), 878.

16. Slutsky, E. "The summation of random causes as the source of cyclic processes," Econometrica, 5 (1937), 105.

17. Spencer Smith, J. L. "The specification of disturbed periodic time-series of the type of Wolfer's sunspot numbers," J. Roy. Stat. Soc., 107 (1944), 231.

18. Yule, G. U. "On a method of investigating periodicities in disturbed series, with special reference to Wolfer's sunspot numbers," Phil. Trans., A226 (1927), 267.

19. Yule, G. U. "On a method of studying time-series based on their internal correlations" (with "Note on Mr. Yule's paper" by M. G. Kendall), J. Roy. Stat. Soc., 108 (1945), 208.

20. Wiener, N. "Generalized harmonic analysis," Acta Mathematica, 55 (1930), 117.

21. Wold, H. A study in the analysis of stationary time-series (Uppsala, 1938).

22. Zernike, F. "Die Brownsche Grenze für Beobachtungsreihen," Zeits. f. Physik, 79 (1932), 516.




Some Instruments for the Analysis of Time Series and their Application to
Textile Research

By G. A. R. Foster 

(British Cotton Industry Research Association) 

I. Introduction 

The analysis of oscillatory time series has received considerable attention from statisticians 
during the last few years. The work described in this paper was completed before these more 
recent developments. It will, nevertheless, I think, be interesting as an illustration of the use of 
the periodogram and correlogram methods as tools in experimental physical research, and will 
serve to call attention to some of the questions, still outstanding, which the physicist would like 
to see answered. 

The large numbers of periodogram and correlogram analyses required for this work would 
not have been possible without instruments to carry out the analyses rapidly and automatically. 
As the instruments may be useful to others working on time series, I shall commence with a brief
account of each instrument, sufficiently complete to enable its advantages and disadvantages to 
be appreciated. For working details reference should be made to the original papers. 


II. Instruments for the Analysis of Time Series 


1. The Grating Periodograph

This is an optical method of performing the periodogram analysis of a series of observations. 
The observations to be analysed are plotted and the area under the curve is made white on a 
black background, as shown in Fig. 6. A curve on thin paper may either be used directly in the 
periodograph or it may be photographed on to a process plate to obtain greater transparency. 

The curve is set up in a vertical plane along the line Fig. 1, with the time axis horizontal, 
and illuminated from behind by diffused light. Parallel to the curve is a grating consisting of a 
series of parallel equidistant vertical slits, and beyond the grating is a vertical ground-glass screen. 
The arrangement is shown in perspective in Fig. 2, in which, however, the screen is shown inclined. 
To return to Fig. 1, if the curve is uniformly illuminated and its ordinates are small compared
with its distance from the ground-glass, the illumination at B on the screen through the slit D1 is
proportional to the ordinate of the curve at T1, that through the slit D2 to the ordinate at T2, and
so on. The total illumination at B is therefore proportional to the sum of a number of equidistant
ordinates of the curve. The spacing of these ordinates is given by

$$ q = \frac{PR}{QR} \cdot D_1 D_2 = \frac{PR}{QR}\, v \qquad (1) $$

where v is the spacing of the grating.

If q corresponds to a peak on the periodogram, the screen is crossed by a series of alternate 
light and dark fringes, but for other values of q the illumination is nearly uniform or the fringes 
are faint. The formation of the fringes is seen more clearly in another way; if a period of 
length T1T2 is present, and its maxima are T1, T2, T3, then there will be a bright band at B, whereas
at a neighbouring point C, which receives light from the minima, the illumination is less, while 
at B' it is again a maximum. The positions of the peaks of the periodogram can thus be observed 
by measuring the positions of the grating and screen which made the visibility of the fringes a 
maximum, and calculating the length of the period from equation (1). For this purpose the 
curve and screen are fixed on an optical bench, and the grating mounted on a sliding stand 
between them. It is an advantage to incline the screen as shown in Fig. 2, for it then cuts through 
the planes corresponding to the different values of q, and each set of fringes appears as an appar- 
ently narrow corrugated band stretching horizontally across the screen. It is then only necessary 
to move the grating until the band of fringes is bisected by a horizontal line on the screen. 
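
The principle can be illustrated numerically. The following Python sketch is an illustration under stated assumptions, not a description of the instrument: the curve is a hypothetical sampled record, the trial spacing q is counted in samples, and the starting offset stands in for the position of the point B on the screen. All function names are invented for the example.

```python
import math


def summed_ordinates(curve, spacing, offset):
    """Sum the ordinates of `curve` taken every `spacing` samples from `offset`."""
    return sum(curve[i] for i in range(offset, len(curve), spacing))


def fringe_visibility(curve, spacing):
    """Contrast (max - min) / (max + min) of the summed illumination as the
    viewing point (here the starting offset) moves across one spacing."""
    sums = [summed_ordinates(curve, spacing, off) for off in range(spacing)]
    brightest, darkest = max(sums), min(sums)
    return (brightest - darkest) / (brightest + darkest)


# Hypothetical record: a period of 20 samples riding on a constant level, so
# that every ordinate is positive, as it must be for a white-on-black curve.
curve = [2.0 + math.sin(2 * math.pi * i / 20) for i in range(400)]

# Try a range of trial spacings and pick the one with the strongest fringes.
visibility = {q: fringe_visibility(curve, q) for q in range(15, 26)}
best_spacing = max(visibility, key=visibility.get)
```

With these assumed data the visibility is greatest when the trial spacing equals the period, which corresponds to the setting an observer would pick out on the screen.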




In this form the periodograph carries out the first part of the periodogram analysis — that is, 
the addition of equally spaced ordinates. Periods whose lengths are submultiples of p will therefore
also produce fringes for the same setting of the grating. These higher-order fringes can be
eliminated by the use of a harmonic grating, i.e., a grating for which the light transmitted is
proportional to (1 + a cos 2πx/s). Such gratings can be prepared by photographing sine curves
in the same form as Fig. 6, with a short focus cylindrical lens placed in front of the camera lens.
With a harmonic grating the periodograph performs the complete periodogram analysis, except,
of course, that the amplitudes of the periods can only be assessed qualitatively from the visibility
of the fringes. These amplitudes can, however, either be calculated from the original data with
the saving of the labour of calculation on all the trial periods which do not correspond to peaks
on the periodogram, or they can be measured on the correlation periodograph described in the
next section.

It will be seen from Fig. 1 that, if the periodograph analysis is to include the whole of the
curve, the width (d) of the grating must be greater than l/(1 + u/v), where l is the length of the
curve. When the grating is narrower than this, the analysis extends only over a length d(1 + u/v)
of the curve. By screening the sides of the grating and sliding the curve lengthways to bring 
each fringe in turn on to a vertical line on the (vertical) screen, the mean phase of the period in 
successive overlapping portions of the curve can be determined. This corresponds to Schuster’s 
method of secondary analysis; it was used in the work described below in the extreme form 
when all the grating is screened except the central line. 

Fig. 3. Correlation Periodograph.

Since the length of the trial period can be varied continuously, the accuracy with which the
periodograph determines the position of a peak is probably slightly greater than that of the
periodogram. As with the periodogram, the width of the band of fringes corresponding to a
period of length p corresponds to a change in 1/p of 1/l, where l is the length of the curve. Experiment
shows that the middle of the band can be fixed to within about a twentieth of its width, so
that the error in 1/p is 1/(20l).
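
The arithmetic of this estimate can be set out in a few lines of Python. This is only a worked example under the assumptions stated in the text (band width 1/l in wave-number, setting accuracy one twentieth of the width); the first-order conversion to an error in p itself is an added step, and the function names and the sample figures are hypothetical.

```python
def wave_number_error(curve_length):
    """Setting error in 1/p: one twentieth of a band of width 1/l."""
    band_width = 1.0 / curve_length
    return band_width / 20.0


def period_error(period, curve_length):
    """First-order error in p itself, from |d(1/p)| = dp / p**2."""
    return period ** 2 * wave_number_error(curve_length)


# Hypothetical case: a curve 20 units long carrying a period 2 units long.
err_k = wave_number_error(20.0)   # error in the wave-number 1/p
err_p = period_error(2.0, 20.0)   # corresponding error in the period p
```

For a curve 20 units long the error in 1/p comes to 0.0025 per unit, and for a period of 2 units this corresponds to about 0.01 unit in p.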

The dimensions of the instrument are as follows. The curve and screen are fixed 50 cm. 
apart, and the curve can be any size up to 20 × 5 cm. Three gratings, 10 × 15 cm., of spacings
4, 1.5 and 0.75 mm., containing 37, 100 and 200 lines, respectively, are used.

2. The Correlation Periodograph 

In the correlation periodograph, due to Martindale, the grating of the Grating Periodograph
is replaced by a replica of the curve on a reduced scale. The arrangement is shown diagrammatically
in Fig. 3. If the curve A is illuminated by diffused light, fans of rays through corresponding
ordinates of the two curves all intersect on a line through P parallel to the ordinates
and distant b from A, where b/(b − a) = C is the ratio of the scale of curve A to that of curve B.
If φ(t) is the ordinate of curve A at t, the total amount of light falling on P is therefore
proportional to

$$ \int_0^{l} \phi^2(t)\,dt. $$

If the curve B is displaced a distance s, the amount of light is proportional to

$$ \int_0^{l} \phi(t)\,\phi(t+s)\,dt, $$

so that by displacing the curve B known amounts the correlogram is obtained.
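
For sampled data the integral becomes a sum of lagged products. The Python sketch below is an illustration only: the record is hypothetical, the function names are invented, and the conversion of the product sums into serial correlations about the mean is the usual arithmetical definition rather than anything the optical instrument is stated to do.

```python
import math


def lagged_product_sum(phi, lag):
    """Discrete stand-in for the integral of phi(t) * phi(t + s) dt."""
    return sum(phi[i] * phi[i + lag] for i in range(len(phi) - lag))


def serial_correlation(phi, lag):
    """Serial correlation at `lag`, computed from deviations about the mean."""
    n = len(phi)
    mean = sum(phi) / n
    dev = [value - mean for value in phi]
    covariance = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
    variance = sum(d * d for d in dev) / n
    return covariance / variance


# Hypothetical record: a period of 12 samples on a constant positive level.
phi = [3.0 + math.cos(2 * math.pi * i / 12) for i in range(240)]

products = [lagged_product_sum(phi, s) for s in range(25)]
correlogram = [serial_correlation(phi, s) for s in range(25)]
```

For this strictly periodic record the correlogram returns to unity at a lag of one period and falls to minus one at half a period.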

In the actual instrument a narrow vertical slit long enough to collect all the light passing 
through the highest ordinates of the curves A and B is placed at P. The light passing through 
the slit is concentrated on to a photo-cell by means of a cylindrical lens and a large condensing 
lens. The photoelectric current is amplified and measured on a galvanometer. The measure- 
ments give accurate relative values of (1 — c,); to obtain absolute values it is necessary to calibrate 
the instrument by replacing the curves in turn by rectangular apertures of known dimensions. 
The present arrangement of the instrument does not, however, permit this to be done with 
sufficient accuracy, because of fluctuations in the voltage on the lamps illuminating the first curve. 
During the determination of the correlogram these fluctuations are almost cancelled out by a 
compensating beam of light from the lamps which falls on to a second photo-cell. This cell and
that measuring the correlogram are in the opposite arms of a bridge circuit, so that the deflections 
of the galvanometer indicate the amount of out of balance of the bridge. Unfortunately the 
compensating beam cannot be used during the calibration. To obtain absolute values would either 
require a controlled voltage on the lamps or the use of a null method similar to that adopted 
by Gray in his photo-electric integraph. The instrument should then be sufficiently accurate
for all practical purposes; in fact, the maximum error in the correlation coefficient should be
about 0.01. As absolute values were not essential in the work on cotton slivers and rovings,
these refinements of the Correlation Periodograph have not yet been carried out. 

It will be noticed that, if the curve B, Fig. 3, is replaced by a sine curve, the periodograph can 
be used for determining the amplitudes of the periods in a periodogram analysis. The length of 
the trial period can be altered by using sine curves of different wave-lengths and by adjusting the 
distance a between the curves. The most rapid procedure would be to determine the positions 
of the peaks on the periodogram by means of the Grating Periodograph and to use the Correlation 
Periodograph to measure their amplitudes. 

Periodogram and correlogram analysis by these optical methods is very rapid once the curves
have been prepared. The time taken to plot a complete correlogram depends on the number of
serial correlations it is required to observe; but since each correlation can be observed easily in
5 seconds, the time, including calibration and calculation of absolute values, is not likely to
exceed an hour.* When the original data are in numerical form the plotting and blackening of
the curves is the most laborious part of the process, but, even so, the time taken is negligible
compared with that needed for arithmetical computation. A fairly simple machine would,
however, enable numerical data to be photographically plotted on to process plates. In the
study of the variations of thickness of cotton slivers and rovings the curves are obtained directly
in the required form on a photographic recorder.

* This applies to the instrument in its present form, in which the deflections of the galvanometer or the
setting of the compensating beam have to be read and plotted. It could, however, easily be made to draw
the correlogram directly, and the time would then be reduced to, probably, about a quarter of an hour.

3. The Planimeter Integrator

An alternative instrument for calculating the correlogram is shown in Fig. 4. It makes use
of an integrating wheel rolling on the surface of a rotating disc, as shown in the plan, Fig. 4. The
integrating wheel, X, is a planimeter wheel (4 cm. diam.) with a scale and vernier reading to 1/1000
revolution, and is held in a frame attached to the rod R, which is free to slide in a direction parallel
to the axis of rotation of X, while X rests on the surface of the disc D. If D rotates at uniform
speed and the displacement of X from the centre of D at time t is x, X will integrate x dt. A
second horizontal disc, C, is supported by a vertical spindle passing through bearings fixed to the
frame of the lower wheel, and rests on the upper side of the rim of X at a fixed distance from the
centre of C, being carried along with X as the rod R is displaced. The rotation of C is thus
proportional to that of X. A second planimeter wheel, Y, rests on the upper surface of C, and
is fixed relatively to D in such a position that it is over the centre of the upper disc when X is
over the centre of the lower. Upon displacement of R, Y is thus the same distance, x, from the
centre of C as X is from the centre of D, and its rate of rotation is therefore proportional to x². It thus
integrates x² dt. To obtain correlation coefficients, the rod R is pivoted to the mid point of a
lever, not shown in Fig. 4, the ends of which are displaced distances equal to the observations
x_t and x_{t+s}. The displacement of R is then ½(x_t + x_{t+s}), and the upper wheel Y integrates

$$ \tfrac{1}{4}\,(x_t + x_{t+s})^2 = \tfrac{1}{4}\left(x_t^2 + 2\,x_t\,x_{t+s} + x_{t+s}^2\right). $$
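
How a product sum can be recovered from readings of this kind may be seen from a short Python sketch. It assumes only what the text states, namely that the rod is pivoted to the mid point of the lever, so that its displacement is the mean of the two end displacements. The data, the function names, and the idea of obtaining the two sums of squares separately are assumptions made for the illustration.

```python
def integrator_reading(x, lag):
    """What the upper wheel would accumulate: the square of the mid-point
    displacement, (x[t] + x[t + lag]) / 2, summed over all pairs."""
    return sum(((x[t] + x[t + lag]) / 2.0) ** 2 for t in range(len(x) - lag))


def product_sum_from_reading(x, lag):
    """Recover sum(x[t] * x[t + lag]) from the reading and two sums of squares,
    using (a + b)**2 = a**2 + 2ab + b**2."""
    n = len(x) - lag
    leading_squares = sum(v * v for v in x[:n])
    lagged_squares = sum(v * v for v in x[lag:lag + n])
    reading = integrator_reading(x, lag)
    return (4.0 * reading - leading_squares - lagged_squares) / 2.0


# Hypothetical observations.
x = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
direct = sum(x[t] * x[t + 2] for t in range(len(x) - 2))
recovered = product_sum_from_reading(x, 2)
```

The recovered value agrees with the directly computed sum of products, so that one squared-sum reading together with the sums of squares is enough to give each term of the correlogram.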

For some purposes the instrument may be more suitable than the optical periodograph, since
it can be used in several ways. Pointers attached to the ends of the pivoted lever may either
move over scales or may follow a graphical record which is driven at a speed proportional to that
of the disc D. For discrete observations a ratchet wheel is fixed to D so that it can be turned
one tooth for each observation. The ends of the lever could also be driven either directly or
through a servo mechanism from a testing instrument. Thus the variations in diameter of a
cotton yarn are usually measured by passing it between a steel plate and a light steel shoe which
rides on the surface of the yarn, and measuring the vertical movements of the shoe. Two such
shoes spaced s apart might through a servo mechanism be used to control the integrator and so
to obtain the correlogram directly by repeated passages of the yarn through the instrument. In
this particular example it happens to be more convenient to use the optical periodograph, but
this is not necessarily true of all measurements. When dealing with discrete observations, each
pair can be dealt with easily in one second.

Fig. 4. The Planimeter Integrator. (Front elevation, side elevation and plan.)


III. The Analysis of Drafting Waves by the Periodogram Method

1. The Drafting of Cotton

In order to make the succeeding discussion intelligible, I shall first of all describe briefly the
process through which cotton passes before it is spun into yarn. The earlier processes are
designed to remove dirt and to separate the fibres from one another, and the cotton finally emerges
from them in the form of a card sliver, which is an untwisted strand or rope about 1 inch in
diameter. It is very soft, and although the fibres composing it are not straight, they are so loosely
entangled that the sliver is easily stretched by drawing the fibres apart so that the sliver can be
drawn out into a fine strand and finally twisted into yarn. The drawing or drafting is usually
done in gradual stages on a number of machines, each of which consists of sets of rollers, which
perform the drafting operation. The process is illustrated in Fig. 5. The sliver passes through
two sets of rollers, the front pair, R2, of which revolves at a higher speed than the back pair, R1,
and so draws the fibres over one another to make the sliver longer and finer. The ratio of the
speeds of the rollers is called the draft, and is usually anything from 2 up to about 10 with ordinary
rollers. The distance apart, h, of the roller nips is called the roller setting, and must obviously
be adjusted to suit the length of the fibres. Since the fibres in one variety of cotton vary in length
from a small fraction of an inch up to 1 or 2 inches, and since the rollers cannot be set closer than
the maximum length of the fibres, there must be a number of fibres which are for a time out of the
control of both sets of rollers. The laws of motion of these “floating” fibres form one of the
fundamental problems of textile research.



Fig. 5. Drafting Rollers. (Roller setting: h.)


If after leaving the back rollers the short fibres always continue to move with the speed of 
these rollers until their front ends are gripped by the front rollers, the motion is always the same, 
and variations in fibre distribution in the entering sliver will be merely repeated in the drafted 
sliver, but will be spread out over greater lengths in proportion to the draft. But some of the 
floating fibres may be dragged forward by the fibres already gripped by the front rollers, and so 
form a thick place in the drafted sliver and a thin place between the rollers. When this thin place 
begins to reach the front rollers the number of floating fibres being dragged forward will decrease 
because there are now fewer fibres to drag them forward. Consequently a thick place will be
formed between the rollers, which will in its turn reach the front rollers, and so the whole process 
will repeat itself. We may therefore expect the drafting to cause an oscillatory* variation in the 
thickness of the drafted sliver. This oscillation is called the drafting wave, and it is the applica- 
tion to it of periodogram and correlogram analysis, which I wish to discuss. 

* For convenience, the word “ oscillatory ” will be used in this paper to include strictly periodic variations 
and also “ quasi ” or “ almost ” periodic variations, which vary in amplitude and phase. The term 
“ strictly periodic ” or where appropriate “ simple harmonic ” will be used for variations which repeat 
exactly. 




2. Periodogram Analysis of the Drafting Wave 

Fig. 6 shows photographic records of the variations in thickness of drafted slivers. The zero
of these records is about half the total height of each illustration below the present base line.
Both slivers were prepared by applying a draft of 3 to a card sliver, but in Fig. 6b the rollers were
set wider than the length of the longest fibres so as to increase the number of floating fibres.
Oscillations can be clearly seen in both records, especially in Fig. 6b, part of which bears
a striking resemblance to the sun-spot series. These would, I think, now be recognized at
once as disturbed oscillations, and the correlogram would be applied to their analysis. The
work on the drafting wave, although only recently published openly, was, however, commenced
before the appearance of Yule’s paper on the sun-spot series, and the periodogram method was
accordingly employed. Preliminary results were very complicated: numbers of periods were
discovered in every drafted sliver examined, many more than one for each drafting process. This
threw considerable doubt on the simple explanation that the periods were due to the motion of
the floating fibres; it could not in any case be concluded that, because the floating fibres might be
expected to draft in an oscillatory manner, an observed oscillation must necessarily be caused by
their motion. To prove this required reasonably accurate means of measuring the wave-length and
amplitude of the drafting wave, so that the results of experiments made with different drafts,
roller settings and with different cottons could be compared. The Grating Periodograph was
therefore constructed, and a thorough examination of a few slivers which had been drafted once
only was undertaken to see if the periodogram could provide such a means of measurement.



Fig. 6. Records of Cotton Slivers. (a) Test D1, D = 3, h = 3.81 cm., 280 cm. of sliver.
(b) Test D2, D = 3, h = 5.08 cm., 170 cm. of sliver.

The whole curve, Fig. 6(a), Test D1, representing 280 cm. of sliver, was first of all examined.
Twenty-nine periods were observed, ranging in length from 3.6 to 38 cm. About half these were
of low amplitude, and might perhaps have been insignificant. It seemed obvious, however, that
Schuster’s test of significance could not be applied in this case, and as an alternative test the
curve was divided into four equal sections each representing 70 cm. of sliver, with the object of
finding out which periods persisted throughout the whole curve. The wave-numbers of the periods
observed are given in Table I, together with their amplitudes classified as high (h), moderate, and
low (l), as judged by the appearance of the fringes in the periodograph. It should be noticed
first of all that the resolving power of the periodograph would be insufficient to separate in a 
length of 70 cm. all the periods that had been observed in the full length of curve, so that if the 
periods persisted through the whole curve many of them should have appeared as broad bands 
in the analysis of the reduced lengths. No such effect was observed ; nearly all the bands were 
of normal width for a single period. It will be seen from Table I that out of the 23 observed 
periods only 4 occur in all four sections, and that, if this were taken as a test of significance, 3 
periods of rather low amplitude (Nos. 24, 26 and 28) would be accepted, while some of high 
amplitude (Nos. 15 to 18) would be rejected. But the curve was also examined in three different 
consecutive sections of 70 cm., the first beginning at the middle of section 1 of the table and 
extending to the middle of section 2 and so on. It was then seen that some of the periods that 
are present in all four sections of Table I were absent from these intermediate sections. 

When the wave-numbers of the periods as abscissae are plotted against a series of whole
numbers as ordinates, the points, with two exceptions, lie fairly close to a straight line of slope
84.9 cm., which in this test passes through the origin, showing that the periods are harmonics of
a fundamental period of 84.9 cm. Table I shows that the agreement of the calculated and
measured wave-numbers is fairly good. The differences are, however, greater than the error of
the periodograph, which for this length of curve is not greater than 0.0014 cm.⁻¹ But this error
is only that in setting the instrument, and takes no account of the random variations in the curve,
which no doubt have some effect on the measured wave numbers.

Fig. 7. Periodogram Analysis of Drafting Wave. Test D1.

It has already been mentioned that when the whole length of this test was examined, 29
periods were observed. As shown in Fig. 7, they lie on a line of slope 120 cm., again with a few
exceptions of low amplitude, while when 100 cm. of curve were tested the slope was 91 cm.
Similar results were obtained on several other slivers prepared with different drafts and roller
settings. They are summarized in Table II, which includes altogether 23 periodogram analyses.
With increasing length of sliver more periods were generally observed and the slope of the line
was increased. The fourth column of the table gives the mean deviation of individual measured
periods from the line, and so indicates the closeness of fit of the periods to the linear relation.
Since no point can be more than 0.5 unit from the line, one would expect for a random distribution
a mean deviation of 0.25 unit. The actual mean deviations are usually much less than this.
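
The measure of fit used here can be sketched in Python. The sketch is an illustration only and makes several assumptions: the line is taken to pass through the origin with a slope of 84.9 cm., the wave-numbers are merely a handful of the size found in Table I, and the function names are invented. It also checks by direct averaging that deviations spread evenly between −0.5 and +0.5 would give a mean deviation of 0.25.

```python
def harmonic_fit(wave_numbers, fundamental_cm):
    """For each wave-number, return the nearest whole number on the line and
    the deviation from it, in units of one harmonic."""
    fitted = []
    for k in wave_numbers:
        position = k * fundamental_cm   # where the point falls on the line
        number = round(position)        # nearest whole number
        fitted.append((number, position - number))
    return fitted


def mean_deviation(fitted):
    """Mean absolute deviation from the line."""
    return sum(abs(dev) for _, dev in fitted) / len(fitted)


# Illustrative wave-numbers in cm^-1 (assumed input, not a full table).
wave_numbers = [0.120, 0.144, 0.177, 0.234, 0.330]
fitted = harmonic_fit(wave_numbers, 84.9)
numbers = [number for number, _ in fitted]
observed = mean_deviation(fitted)

# Deviations spread evenly over (-0.5, 0.5): average of |u| on a fine grid.
grid = [-0.5 + (i + 0.5) / 1000 for i in range(1000)]
expected_if_random = sum(abs(u) for u in grid) / len(grid)
```

A mean deviation well below the value expected for a random scatter is what indicates that the wave-numbers cluster about the line.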







Table I

Test D1. Wave Numbers in cm.⁻¹

(h, high amplitude; l, low amplitude; no letter, moderate; -, period not observed.)

             Number of section                   Harmonics of 84.9 cm.
    1          2          3          4          Wave number   Number
 0.361 l    0.364      -          -             0.365         31
 0.352 l    0.348 l    -          -             0.354         30
 0.328 l    0.327 l    0.332 l    0.330 l       0.330         28
 0.309 l    0.307      0.304      0.307 l       0.307         26
 0.295 l    -          -          -             0.295         25
 0.279      0.277      0.281 l    0.280 l       0.283         24
 0.265      -          0.263      0.265 l       0.260         22
 0.250      -          -          -             0.248         21
 -          0.240 l    0.243      -             -             -
 0.234      -          -          0.236         0.236         20
 -          -          -          0.220         0.224         19
 0.217      0.214 h    -          -             0.213         18
 0.200 h    -          -          -             0.201         17
 -          0.192 h    0.189      -             0.189         16
 0.180 h    -          -          0.177 h       0.177         15
 -          0.172 h    -          -             -             -
 -          -          0.164      -             0.164         14
 0.144 h    0.141 h    0.146 h    0.144 h       0.142         12
 -          0.120      -          0.118         0.118         10
 -          0.093      0.088      0.093 l       0.094          8
 0.079      -          -          0.078 l       0.082          7
 -          0.071      -          0.069         0.071          6
 0.049      0.053      -          0.050         0.048          4


This linear relation is the only definite fact which emerges from all these periodogram analyses;
the particular periods present vary in an apparently haphazard manner from one portion of the 
sliver to another, and the slope of the line cannot be regarded as characteristic of the drafting 
wave. There is thus no basis of comparison of one test with another, and the method is therefore 
practically useless for the purpose of measuring the effects of various drafting conditions on the 
amplitude and wave length of the drafting wave. 


Table II

Test   Length of sliver     Slope,   Mean        Wave-length,   Wave-number,
no.    examined, cm.        cm.      deviation   cm.            cm.⁻¹
D1     280                  120.1    0.16        5.92           0.169
       100                   91.0    0.09        -              -
       70 (4 sections)       84.9    -           -              -
D2     166                   99.5    0.17        6.62           0.151
       100                   85.5    0.15        -              -
       83 (2 sections)       45.0    0.08        -              -
       41.5 (4 sections)     45.6    0.22        -              -
D3     81 (2 sections)       51.2    0.12        7.58           0.132
D4     81 (2 sections)       45.1    0.15        5.66           0.176
F1     61 (4 sections)       58.9    0.22        -              -
       149                  105.7    0.17        7.6            0.131


3. Periodogram of a Disturbed Series

In Fig. 6 the drafting wave appears to be a simple period 5-7 cm. long, which varies con- 
siderably in amplitude. The next step was therefore to consider what kind of results a wave of 
this type might be expected to give on analysis by the periodogram method.



If we assume in the first place that the variations in amplitude are themselves periodic, they
can be represented by a Fourier series:

$$ a_0 + a_1 \cos(2\pi q x + \varepsilon_1) + a_2 \cos(4\pi q x + \varepsilon_2) + \ldots $$

The only condition that the a's must satisfy is that the high order ones should be small, for
these are the amplitudes of periodic variations in the amplitude shorter than the drafting wave
itself, and if such amplitudes were not small, we should not describe the wave as one which simply
varied in amplitude.

If the wave-number of the drafting wave is k, its equation is then

$$
\begin{aligned}
y &= \{a_0 + a_1 \cos(2\pi q x + \varepsilon_1) + \ldots\} \cos 2\pi k x \\
  &= a_0 \cos 2\pi k x + \tfrac{1}{2}\left[a_1 \cos\{2\pi(k+q)x + \varepsilon_1\} + a_2 \cos\{2\pi(k+2q)x + \varepsilon_2\} + \ldots\right] \\
  &\quad + \tfrac{1}{2}\left[a_1 \cos\{2\pi(k-q)x - \varepsilon_1\} + a_2 \cos\{2\pi(k-2q)x - \varepsilon_2\} + \ldots\right] \qquad (2)
\end{aligned}
$$

The simple harmonic components have wave-numbers:

. . . (k − 3q), (k − 2q), (k − q), k, (k + q), (k + 2q), (k + 3q) . . .

and amplitudes:

. . . ½a_3, ½a_2, ½a_1, a_0, ½a_1, ½a_2, ½a_3 . . .

When plotted against a series of whole numbers, the wave-numbers therefore lie on a straight
line, whose slope is the wave-number of the amplitude period. The amplitudes are symmetrical
about the middle of the range of periods, and tend to be greatest in the middle and least at the
ends.
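
The identity behind equation (2) is easily checked numerically. The Python sketch below is such a check and nothing more: the values of k, q, the a's and the phases are arbitrary choices made for the illustration, and the function names are invented.

```python
import math


def modulated_wave(x, k, q, a, eps):
    """Carrier cos(2 pi k x) multiplied by the Fourier series for the amplitude."""
    envelope = a[0] + sum(
        a[n] * math.cos(2 * math.pi * n * q * x + eps[n]) for n in range(1, len(a))
    )
    return envelope * math.cos(2 * math.pi * k * x)


def sum_of_components(x, k, q, a, eps):
    """The same wave written as simple harmonic components at k and k +/- n q."""
    total = a[0] * math.cos(2 * math.pi * k * x)
    for n in range(1, len(a)):
        total += 0.5 * a[n] * math.cos(2 * math.pi * (k + n * q) * x + eps[n])
        total += 0.5 * a[n] * math.cos(2 * math.pi * (k - n * q) * x - eps[n])
    return total


# Arbitrary illustrative values; eps[0] is unused.
k, q = 0.17, 0.012
a = [1.0, 0.6, 0.3]
eps = [0.0, 0.4, 1.1]

xs = [0.1 * i for i in range(1000)]
largest_difference = max(
    abs(modulated_wave(x, k, q, a, eps) - sum_of_components(x, k, q, a, eps))
    for x in xs
)
component_wave_numbers = [k + n * q for n in range(-2, 3)]
```

The two forms agree to within rounding error, and the component wave-numbers are equally spaced at intervals of q about k.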

For a finite length of curve the amplitude variations of the disturbed period resemble a periodic 
variation of fundamental length comparable with the length of the curve, and the disturbed 
period would therefore give the kind of periodograms which have been described in the last 
section. The appearance and disappearance of many of the components in different portions of
the curve are simply accounted for by the changes in form of the amplitude variations. It will 
be noticed, however, that the differences between the observed wave-numbers and that of the 
disturbed period represent the periodogram of the amplitude variation in the length of curve 
examined, and, if the amplitude variations are purely random, it seems therefore rather remark- 
able that the observed periods do so often recur in different sections of the curve, and that their 
wave-numbers do not depart more from the linear relationship. A more rigorous mathematical 
treatment seems to be needed. 

It is still, however, necessary to compare the amplitudes of the simple periods with equation (2). 
These were calculated for all the periods observed in the periodograph for three of the above 
tests. The complete periodograms are given in Fig. 8, in which the 100-cm. length for Test D1
is the same as that for Fig. 7. The periodograms show a general tendency for the amplitudes to 
be greatest in the middle of the range. But the amplitudes are not symmetrical. This means 
that on reversing the calculation of equation (2) the simple components to be combined are 
unequal in amplitude, and the resultant then fluctuates in phase as well as in amplitude. In all 
other tests examined it could be clearly seen on the screen of the periodograph that the grouping 
of the periods resembled Fig. 8. The periodogram was thus useful in providing strong evidence 
that the drafting wave was a simple disturbed oscillation fluctuating in amplitude and phase, but 
it was useless for obtaining measurements of the wave-length and amplitude of the wave. 
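
The effect of unequal side components on the phase can be illustrated in Python. The sketch assumes a carrier of amplitude a0 with a single pair of components at k + q and k − q, and writes the wave as the real part of a slowly varying complex envelope multiplied by exp(2πikx). This is a standard device, not one used in the text, and all names and values are hypothetical.

```python
import cmath
import math


def complex_envelope(x, q, a0, upper, lower, eps=0.0):
    """E(x) such that y(x) = Re[E(x) * exp(2 pi i k x)], for a carrier of
    amplitude a0 with components `upper` at k + q and `lower` at k - q."""
    theta = 2 * math.pi * q * x + eps
    return a0 + upper * cmath.exp(1j * theta) + lower * cmath.exp(-1j * theta)


def phase_swing(q, a0, upper, lower, steps=200):
    """Range of the phase of the resultant over one period of the modulation."""
    phases = [
        cmath.phase(complex_envelope(i / (steps * q), q, a0, upper, lower))
        for i in range(steps)
    ]
    return max(phases) - min(phases)


q = 0.012
equal_pair = phase_swing(q, a0=1.0, upper=0.25, lower=0.25)
unequal_pair = phase_swing(q, a0=1.0, upper=0.4, lower=0.1)
```

With equal side components the envelope stays real and only the amplitude varies; with unequal ones the phase of the resultant swings back and forth as well.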

It should be added, however, that in some cotton yarns and slivers mechanical defects in the 
machinery introduce variations which are strictly periodic. Provided that their amplitudes are 
not too small, these periods produce in the periodograph bands of fringes, which stand out among 
the weaker fringes due to the drafting wave, so that the wave-length is easily measured and the 
cause of the period identified. 

Fig. 8. Periodograms of the Drafting Wave in Cotton Slivers. Tests D1 and D2, 100 cm. of sliver.
Test F1, 149 cm. of sliver. (Ordinates: amplitude, per cent.)

The inevitable splitting up of a disturbed period into an often large number of simple components
is a property of periodogram analysis, which, though it is the consequence of well-known
principles, has not, as far as I am aware, received sufficient emphasis in connection with
the analysis of economic and meteorological time series. The simple components have constant
phase and amplitude throughout the whole portion of the series analysed. If the mechanism
generating the series is such as to produce harmonic oscillations of constant phase and amplitude,
then the peaks of the periodogram will indicate the periods of the mechanism; but if for any reason
the phase or amplitude of the oscillation varies or changes, then the periodogram will split it up
into two or more simple components, whose periods may have little direct relation to the natural
periods of the mechanism. When the changes of amplitude and phase are frequent and accidental,
the periodogram will indicate large numbers of periods. As the length of the portion
of the series analysed increases, the total variation in amplitude becomes more complex, the
number of simple components increases and their amplitudes decrease. From the point of view
that they are the results of accidental causes and that their amplitudes tend to decrease with
increasing size of sample, these components might perhaps be regarded as being statistically
insignificant. On the other hand, they are certainly significant in the sense that they are the
result of a real tendency of the system to produce periodic oscillations. With a series of this type
we may, as with the drafting wave, only arrive at a useful physical interpretation by putting the
components together again.
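
A simple numerical case shows the splitting. In the Python sketch below a steady oscillation is compared with the same oscillation reversed in phase half-way along the record; the amplitude of the simple harmonic component at a trial wave-number is computed directly. The record, its length and the function name are assumptions made for the illustration, and the single abrupt reversal is a deliberately extreme stand-in for the gradual disturbances discussed in the text.

```python
import cmath
import math


def harmonic_amplitude(y, dx, wave_number):
    """Amplitude of the simple harmonic component of `wave_number` fitted to y."""
    total = sum(
        value * cmath.exp(-2j * math.pi * wave_number * i * dx)
        for i, value in enumerate(y)
    )
    return 2.0 * abs(total) / len(y)


dx, n, k = 0.1, 1000, 0.15   # record of length 100 containing 15 waves
steady = [math.cos(2 * math.pi * k * i * dx) for i in range(n)]
reversed_half = [v if i < n // 2 else -v for i, v in enumerate(steady)]

length = n * dx
at_k_steady = harmonic_amplitude(steady, dx, k)
at_k_reversed = harmonic_amplitude(reversed_half, dx, k)
below = harmonic_amplitude(reversed_half, dx, k - 1.0 / length)
above = harmonic_amplitude(reversed_half, dx, k + 1.0 / length)
```

For the steady record the whole amplitude appears at k. After the reversal nothing is left at k, and the oscillation is represented instead by components on either side of it, neither of which corresponds to the period of the mechanism.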

In a paper read to this Society about a year ago, Kendall made a comparison of the periodogram
and correlogram methods as applied to an economic series and to some artificial auto-
regressive series, and concluded that where there is any possibility that a series is of the auto- 
regressive type, the periodogram may not only be worthless, but extremely dangerous in suggest- 
ing periods of no reality. The conclusion drawn from the measurements on the drafting wave is 
similar, but not quite so drastic. The literal interpretation of the observed periods would cer- 
tainly be extremely misleading, but the periodogram was useful in showing that we were almost 
certainly dealing with a single disturbed period. 


4. Secondary Analysis of the Drafting Wave 

An attempt was next made to measure the drafting wave by a limiting form of Schuster’s 
method of Secondary Analysis. The whole of the grating on the Grating Periodograph was 
screened, except the central line. The grating was set to correspond with the approximate wave- 
length obtained by counting peaks on the curve, and the curve was then moved along to bring 
each bright fringe on the vertical ground-glass screen in turn on to a vertical line. A scale on 
the curve holder gave the phases of the maxima on the curve. The process is equivalent to 
fitting a sine curve one period at a time to the sliver curve. The maxima were numbered in order, 
and their distances from the beginning of the curve plotted against their numbers. The result 
of a typical series is shown in the upper part of Fig. 9, in which the circles represent the phases of 
the maxima as measured. They fall on a series of parallel straight lines. Since it is known that 
the amplitude is very variable, it is reasonable to assume that it is sometimes so small that some 
of the maxima are unobservable. On this assumption the mean slope of the lines measures the 
wave-length. The upper line in Fig. 9 was drawn with this mean slope, and the points brought 
close to it by renumbering the maxima to allow for possible unobserved ones. (The beginning 
of the upper line is plotted one unit higher to bring it clear of the short lines.) When the wave- 
length obtained from the lines differed appreciably from that to which the grating had originally 
been set, the grating was re-set to the new wave-length and the work repeated. 
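
The renumbering step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the least-squares fit of position against number, and the example positions are hypothetical and are not the author's graphical procedure. Each observed maximum is given the nearest whole number of wave-lengths from the first one (so that unobserved maxima leave gaps in the numbering), and the slope of position against number is taken as the revised wave-length.

```python
def wavelength_from_maxima(positions, initial_wavelength, iterations=5):
    """Estimate a wave-length from the positions of observed maxima.

    positions          -- distances of the observed maxima from the start of the curve
    initial_wavelength -- rough value, e.g. from counting peaks (must be fairly close)
    Returns (wavelength, numbers), where numbers are the renumbered maxima.
    """
    wavelength = initial_wavelength
    numbers = []
    for _ in range(iterations):
        # Renumber: nearest whole number of wave-lengths from the first maximum,
        # which leaves gaps where a maximum was too small to be observed.
        numbers = [round((x - positions[0]) / wavelength) for x in positions]
        n = len(numbers)
        mean_k = sum(numbers) / n
        mean_x = sum(positions) / n
        sxx = sum((k - mean_k) ** 2 for k in numbers)
        sxy = sum((k - mean_k) * (x - mean_x) for k, x in zip(numbers, positions))
        # Slope of position against number = revised wave-length.
        wavelength = sxy / sxx
    return wavelength, numbers


# Hypothetical record: true spacing about 7 units, two maxima unobserved.
observed = [0.0, 7.1, 13.8, 28.2, 35.0, 49.1]
estimate, numbering = wavelength_from_maxima(observed, initial_wavelength=7.2)
```

As in the text, the result depends on a reasonable first setting; a poor initial value can assign wrong numbers to the later maxima.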

Wave-lengths and wave-numbers obtained by this method are given in the last two columns 
of Table II. The wave-numbers may be compared with the periodograms in Fig. 8. 

From these measurements the amplitude can be approximately estimated by marking off the 
positions of the maxima on the curve, taking the minima to be half-way between them, and 
subtracting the minimum from the maximum ordinates. 

This rather crude method of measurement was sufficiently accurate and reliable to allow 
slivers drafted under different conditions to be compared and a series of experiments to be carried 
out which demonstrated that the drafting wave is in fact caused by the oscillatory motion of the 
shorter fibres in the cotton. Nevertheless the method is not very satisfactory. While it cannot 
be denied that when the amplitude of the oscillation is very variable it may sometimes be so low that 
a peak cannot be observed, a lot of personal judgment is occasionally required, and it is by no 
means certain that the gaps between the lines in Fig. 9, for example, are not due to real changes 
of phase rather than to very low amplitudes. It is also only when the average amplitude of the 
wave is fairly high that the method can be used, and this puts rather severe limitations on the 
work. It might possibly be a useful method for oscillations less disturbed than the drafting wave. 




IV. CORRELOGRAM ANALYSIS OF THE DRAFTING WAVE 

Before discussing the application of the correlogram to the drafting wave, it is worth while to 
describe more fully the nature of the series with which we have to deal. If we ignore the varia- 
tions in mass per unit length along each fibre and from fibre to fibre — variations which for the 
slivers used in this work contribute only a very small fraction to the total variance — the thickness 
or mass per unit length of a sliver at a given point is proportional to the number of fibres crossing 
a section of the sliver at that point. It follows at once that measurements of thickness at sections 
separated by distances less than the maximum length of fibre are correlated, because some fibres 
are common to both sections. Spencer-Smith and Todd have calculated this correlation for 
flax slivers; for cotton slivers their formula reduces to:

$$ r_s = \frac{\text{Number of fibres crossing both sections}}{\text{Number of fibres crossing one section}} = \frac{\int_s^{l_m} (l - s)\,f(l)\,dl}{\int_0^{l_m} l\,f(l)\,dl} $$

where $f(l)\,dl$ is the frequency of fibres having lengths between $l$ and $l + dl$, and $l_m$ is the maximum 
fibre length. There is a little uncertainty in applying this to actual slivers, as the fibres in the 
slivers are not as straight and parallel as they are when their lengths are being measured during 
the determination of $f(l)$. 
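
A minimal numerical sketch of this ratio follows. It assumes straight, parallel fibres whose positions along the sliver are uniformly random, and a length distribution given by number; under those assumptions a fibre of length l crosses one section with probability proportional to l, and both sections a distance s apart with probability proportional to max(l - s, 0). The function name and the example distribution are hypothetical and are not taken from Spencer-Smith and Todd.

```python
def overlap_correlation(s, lengths, freqs):
    """Fraction of the fibres crossing one section that also cross a second
    section a distance s away (random, straight, parallel fibres assumed)."""
    both = sum(f * max(l - s, 0.0) for l, f in zip(lengths, freqs))
    one = sum(f * l for l, f in zip(lengths, freqs))
    return both / one


# Hypothetical fibre length distribution (lengths in cm, relative frequencies).
lengths = [1.0, 1.5, 2.0, 2.5]
freqs = [0.2, 0.3, 0.3, 0.2]
curve = [overlap_correlation(0.25 * k, lengths, freqs) for k in range(12)]
```

The ratio is 1 at s = 0, falls steadily, and vanishes once s reaches the longest fibre length.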

Spencer-Smith and Todd and Martindale have shown that on perfect machinery with 
perfect control over the fibres during drafting, the most uniform sliver that we can hope to make 
is one in which the fibres are distributed at random along the length of the sliver, in which the 
number of fibres crossing a section of the sliver, or the fibre number as it is usually called, follows 
a Poisson distribution with variance equal to the mean fibre number. In addition to this cause 
of variance, we have, in cotton slivers, the more important causes of variance, the drafting wave 
and variations due to mechanical defects in the machinery. The following figures give an idea 
of the relative importance of these variances in the rovings used for the correlogram analysis. 
The total coefficient of variation varied from 10 to 20 per cent., according to the drafts and 
roller settings employed, the coefficient of variation for the random arrangement of the fibres 
from about 2 to 4 per cent., and that due to the variation in mass of the fibres was about 1 per 
cent. Mechanical defects were eliminated as far as possible by tuning up the machine on which 
the rovings were prepared. 

Fig. 10. Correlograms of Drafting Wave. D = 8.26, h = 4.05 cm. Twist in entering roving 0.48 turns per in. 
Sample lengths (a) 340 cm. (b) 1700 cm. (c) 3400 cm. 
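
The figures quoted for the random component can be related to the fibre number with a short calculation. The sketch below assumes only the Poisson property stated in the text (variance equal to the mean), so that the coefficient of variation is 100 divided by the square root of the mean fibre number. The last function additionally assumes that the separate causes of variation are independent, so that their variances add; that assumption, and all function names, are illustrative.

```python
import math


def poisson_cv_percent(mean_fibre_number):
    """Coefficient of variation (per cent.) of a Poisson fibre number."""
    return 100.0 / math.sqrt(mean_fibre_number)


def mean_fibre_number_for_cv(cv_percent):
    """Mean fibre number that gives the stated Poisson coefficient of variation."""
    return (100.0 / cv_percent) ** 2


def residual_cv_percent(total_cv, *component_cvs):
    """Coefficient of variation left after removing independent components
    (assumes the component variances simply add)."""
    return math.sqrt(total_cv ** 2 - sum(c ** 2 for c in component_cvs))


# A random-arrangement CV of 2 to 4 per cent. corresponds to mean fibre
# numbers between these two values.
coarse = mean_fibre_number_for_cv(2.0)
fine = mean_fibre_number_for_cv(4.0)
# Share left for the drafting wave when the total is 10 per cent.
wave_cv = residual_cv_percent(10.0, 4.0, 1.0)
```

On these assumptions the random and fibre-mass components account for only a small part of the total variance, in line with the text.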

The application of the correlogram to textile slivers has previously been discussed by Spencer- 
Smith and Todd in the paper already referred to, in which they treat it as a measure of the 
departure of actual flax slivers from the ideal random sliver. The deviations of their measured 
serial correlations from those expected for a random sliver were all significant, and suggested 
that some kind of periodicity is imposed on the random structure of the sliver. 

Here the point of view is somewhat different. Further progress in the researches on the 
motion of the floating fibres demanded methods of measuring the wave-length and amplitude of 
the drafting wave, so that these and the laws governing their changes with draft, roller setting 
and variety of cotton could be discovered and compared with theoretical studies of the fibre 
motion. 

The correlogram analysis was therefore undertaken primarily as a better method of measuring 
the wave-length than that of secondary periodogram analysis, but we also hoped that something 
might be revealed of the type of motion of the fibres and of the nature and magnitude of the 
disturbances.® 

From a statistical point of view it would be interesting to have correlograms of the slivers on 
which the periodogram analyses were performed; but the work had to be co-ordinated with 
other researches, and was accordingly carried out on rovings. These are similar to sliver, except 
that they are finer, and that the fibres composing them have been more or less straightened and 
parallelized by previous drafting. Normally rovings have also just sufficient twist inserted to 
allow them to be handled without damage. 

A typical set of correlograms is shown in Fig. 10. In the description of the Correlation 
Periodograph, upon which the analyses were made, it was pointed out that in its present form the 
instrument gives only relative values of the correlation coefficients. In order to plot all the 
correlograms on the same scale, it has been assumed that the mean correlation for large values of 
s is zero. Each curve is the mean of those for positive and negative values of s. 



Fig. 11. Correlograms of the Drafting Wave. D = 8.26, h = 2.03 cm. Sample length 3400 cm. Twist in entering 
roving (a) 0.48 turns per in. (b) 1.4 turns per in. 


Fig. 10(a) is for a sample of roving 340 cm. long (the length for the first curve in the periodo- 
graph; the total length used was this plus twice the maximum value of s). The correlogram is 
rapidly damped during the first period, but thereafter fluctuates between about ±0.1. The 
curves for the longer samples, Fig. 10(b) and (c), show, however, that this fluctuation is due merely 
to the finite size of the sample. The small peaks on the tail of the correlogram are therefore not 
statistically significant, because these decrease as the size of the sample increases. The correlation 
at the first minimum, the first maximum, and perhaps also the second minimum, either increases 
or remains almost constant. 
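
The dependence of the tail on sample size can be illustrated with an artificial series. The sketch below is not a model of the roving: it assumes a second-order auto-regressive series with hypothetical coefficients, hypothetical sample sizes and a hypothetical choice of "tail" lags, and it computes the ordinary sample serial correlations directly rather than on the instrument. It shows only that the root-mean-square height of the tail shrinks as the sample is lengthened.

```python
import random


def simulate_ar2(n, a, b, seed, burn_in=200):
    """Generate n terms of u[t+2] + a*u[t+1] + b*u[t] = eps[t+2]."""
    rng = random.Random(seed)
    prev2, prev1 = 0.0, 0.0
    series = []
    for t in range(n + burn_in):
        u = -a * prev1 - b * prev2 + rng.gauss(0.0, 1.0)
        prev2, prev1 = prev1, u
        if t >= burn_in:  # discard the start-up transient
            series.append(u)
    return series


def correlogram(series, max_lag):
    """Sample serial correlations r[0..max_lag] about the sample mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[i] * dev[i + s] for i in range(n - s)) / (n * c0)
        for s in range(max_lag + 1)
    ]


def tail_rms(r, first_lag):
    """Root-mean-square serial correlation from first_lag onwards."""
    tail = r[first_lag:]
    return (sum(x * x for x in tail) / len(tail)) ** 0.5
```

For example, `tail_rms(correlogram(simulate_ar2(300, -1.1, 0.5, seed=1), 60), 30)` can be compared with the same quantity for a sample of 30000 terms; the true correlations at these lags are negligible, so what remains is sampling fluctuation.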

Fig. 11(a) is the correlogram for a similar roving drafted with a closer roller setting; the 
damping is greater. This may either be due to greater damping of the drafting wave itself or to 
its lower amplitude relative to random variations in the roving. In Fig. 11(b) the roving entering 
the rollers had about three times the normal twist. The damping is now much less, and the 
oscillations persist over at least three periods before they become insignificant. The effect of 
twist on the wave is well shown in Fig. 12, which shows the variations in thickness of part of the 
rovings used for the correlograms of Figs. 10 and 11(b). 

These and many similar correlograms showed that with normally twisted rovings the first 
minimum was always significant, and usually also the first maximum. It was therefore concluded 
that there was a definite tendency for the motion of the floating fibres to be periodic, but that 
either the damping is so heavy or the disturbances are so great that this tendency persists only for 




1946] 


57 


Time Series and their Application to Textile Research 

a half to one and a half periods. The smaller damping of the correlograms for more highly twisted 
rovings suggested that the disturbances are due to inequalities in cohesion or openness of the 
cotton, which are less important when the fibres are more closely bound together by the twist. 
The twist also, by increasing the frictional forces between the fibres, probably increases the 
tendency towards a steady oscillation. 

When, however, the correlograms are used to measure the wave-length of the drafting wave, 
various difficulties are encountered. At first we measured the mean distances apart of the peaks 
on the tails of correlograms such as Fig. 10(a), regarding the tail as a smoothed and averaged 
version of the original record. This, however, seemed rather unsound, as these peaks are 
statistically insignificant, a fact which was rather forcibly brought home to us when we realized 
that we were deliberately using rather small samples in order to preserve good peaks in the tail. 
The width of the central peak was also measured at definite values of the correlation coefficient, 
and attempts were made to fit equations to it. Such methods are, however, bound to be inaccurate 
because the shape of the central peak is affected by variations in the curve other than the drafting 
wave, and also by the correlations at short distances due to the fibres which stretch across both 
sections of the roving. This correlation is zero for values of s greater than the maximum 
fibre length, which was 2.7 cm. for the cotton used for the correlograms illustrated. Finally 
the distance of the first maximum from the ordinate s = 0 was taken as a measure of the 
wave-length; it was nearly always well defined, though sometimes, as, for example, in Fig. 11(a), 
it was no higher than the peaks on the tail, and on some few rovings it was so ill defined as to 
render accurate location impossible. It is doubtful whether in such cases the wave-length has 
any definite meaning. Twice the value of s at the first minimum, which was always clearly defined, 
was also tried, but as it gave values on the average about 10 per cent. higher than the first 
maximum, we concluded that it was affected by some of the same factors which affect the shape 
of the central peak. 

Fig. 12. Records of Cotton Rovings. 500 cm. of roving. (a) Corresponding to Fig. 10, 
(b) corresponding to Fig. 11(b). 

One feature of the measurements is the large variation of the wave-length from sample to 
sample of the same roving. On a number of different rovings the range of nine measurements 
of wave-length made on 340-cm. samples, each including about 50 periods, was anything from 
15 to 30 per cent. As these variations in the position of the first maximum were accompanied 
by corresponding changes in the width of the initial peak and in the spacing of the peaks on the 
tails, we thought they might be due to real changes in the period with which the drafting tends 
to oscillate, caused possibly by imperfect admixture of the cotton or variations in twist; but we 
were able to rule out all such causes, and concluded finally that there was no evidence for changes 
in the fundamental period of oscillation, and that the variations in measured wave-length were 






58 Foster — Some Instruments for the Analysis of [No. U 

due partly to the error in measuring the position of the first maximum on the correlograms and 
partly to the variations in phase caused by the disturbances. 

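Recently Kendall⁷ has shown that for the auto-correlated series generated by:

$$ u_{t+2} + a\,u_{t+1} + b\,u_t = \varepsilon_{t+2} \tag{3} $$

of which the complementary function is 

$$ u_t = p^t (A \cos \theta t + B \sin \theta t) \tag{4} $$

the equation of the correlogram for an infinite series is 

$$ r_s = \frac{p^s \sin(s\theta + \psi)}{\sin \psi} \tag{5} $$

in which 

$$ \tan \theta = -\frac{\sqrt{4b - a^2}}{a}, \qquad p = \sqrt{b} \qquad \text{and} \qquad \tan \psi = \frac{1 + p^2}{1 - p^2} \tan \theta. $$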

There is, therefore, a phase difference between the first maximum and the peak at s = 0, and 
consequently the position of the first maximum cannot be used to obtain the length of the period 
unless the damping is known. Kendall therefore suggested that the period should be estimated 
from the peaks or upcrosses beyond the first maximum. This is not, however, very satisfactory 
when these peaks are not statistically significant, and since, for the reasons already explained, 
it does not seem possible to estimate the damping from the shape of the initial significant part 
of the correlogram, we are left with no sound method of measuring the length of the drafting wave. 

In order to get an idea of the magnitude of this phase difference, I have calculated approxi- 
mately the position of the first peak, $T_2$, on the correlogram for three artificially generated series 
(Kendall) according to equation (5) and compared it with the distance apart of the peaks, $T_1$, 
on the undisturbed motion (4) and with the fundamental period, $T_0$. The results are given in 
Table III. 

Table III 

Series no.      a       b      T_0      T_1      T_2      T_m      T_p
    1         -1.1     0.5     9.25     8.56     8.17     8.3      8.2
    2         -1.2     0.4    19.53    16.55    16.11    12.7     12.7
    3         -1.1     0.8     6.92     6.79     6.64     7.0      7.9

$T_m$ is the observed position of the first maximum on the correlogram, and $T_p$ the period measured 
from the peaks other than the main one at s = 0. The appearance of the correlograms suggests 
that the damping of these series covers the range of damping for the drafting wave. It will be 
seen that the effect of the phase difference is not very great; in fact it is no greater than the errors 
of measurement for the drafting wave; the chief trouble is, of course, that it is a systematic 
error. We should not therefore be very far wrong in taking our measurements on the drafting 
wave as approximations to the wave-length of the undisturbed damped motion. The most 
striking features of the table are, however, the big difference between the observed and calculated 
periods for series 2, and that generally the differences between observed and calculated values 
are greater than the effect of the phase difference. 
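
The fundamental periods in the table, and the closed form (5), can be checked numerically. The sketch below assumes the scheme (3) in the form u[t+2] + a u[t+1] + b u[t] = eps[t+2], with p = sqrt(b) and cos(theta) = -a / (2 p); it compares the closed form with the serial correlations obtained from the recurrence r[s+2] = -a r[s+1] - b r[s], starting from r[0] = 1 and r[1] = -a / (1 + b). The function names are illustrative, and the value a = -1.1 for series 1 is taken as the one consistent with the tabulated fundamental period.

```python
import math


def ar2_parameters(a, b):
    """Return (p, theta, psi) for the scheme u[t+2] + a*u[t+1] + b*u[t] = eps."""
    p = math.sqrt(b)
    theta = math.acos(-a / (2.0 * p))
    psi = math.atan((1.0 + b) / (1.0 - b) * math.tan(theta))
    return p, theta, psi


def fundamental_period(a, b):
    """Period 2*pi/theta of the undisturbed damped oscillation."""
    return 2.0 * math.pi / ar2_parameters(a, b)[1]


def correlogram_closed_form(a, b, max_lag):
    """Equation (5): r_s = p**s * sin(s*theta + psi) / sin(psi)."""
    p, theta, psi = ar2_parameters(a, b)
    return [p ** s * math.sin(s * theta + psi) / math.sin(psi)
            for s in range(max_lag + 1)]


def correlogram_recursion(a, b, max_lag):
    """Same correlogram from the recurrence satisfied by the serial correlations."""
    r = [1.0, -a / (1.0 + b)]
    for s in range(2, max_lag + 1):
        r.append(-a * r[s - 1] - b * r[s - 2])
    return r[: max_lag + 1]


# (a, b) for the three series of the table; a for series 1 is an assumption.
series = {1: (-1.1, 0.5), 2: (-1.2, 0.4), 3: (-1.1, 0.8)}
periods = {k: fundamental_period(a, b) for k, (a, b) in series.items()}
```

The two routes to the correlogram agree to rounding error, which gives some confidence in the form of (5) used here.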

This argument, however, applies to auto-correlated series according to equation (3), which 
corresponds to a damped simple harmonic motion. There is, however, no evidence that the 
drafting wave is analogous to such a motion, and it seems, therefore, rather unsound to apply 
conclusions derived from the auto-correlated series to the drafting wave. 

Further progress might perhaps now be made by an attempt to analyse the main peaks of the 
drafting wave correlograms into their components, but for the present it seems that the best we 
can do is to adopt the common-sense view that the position of the first maximum indicates the 
periods over which the time series tends to repeat itself, and that this period is connected in some 
way with the underlying mechanism which produced the series. The work on the drafting wave 
was in fact continued on this assumption ; the wave-length was related empirically to the draft, 
the roller setting and the twist in the original roving. These relations when combined with theo- 
retical calculations of the motion of the fibres enabled the amplitude of the wave to be calculated. 
The calculated amplitudes agreed reasonably well with the measured variances of drafted rovings. 




In order to provide an approximate comparison of the periodogram with the correlogram, 
the records of Fig. 12(a) and (b) have been analysed in the Grating Periodograph. The periodo- 
grams are given in Fig. 13. The amplitudes of the periods were not calculated, but were estimated 
from the appearance of the fringes in the periodograph; they are therefore only rough estimates, 
but the periodograms are sufficient to show the grouping of the periodogram components around 
the wave number obtained from the first maximum of the correlogram. Fig. 13(b), which corre- 
sponds to the less heavily damped correlogram, contains a larger proportion of periods of high 
amplitude than Fig. 13(a). 

Fig. 13. Periodograms of the Drafting Wave in Cotton Rovings. (a) Periodogram of Fig. 12(a). (b) Periodogram 
of Fig. 12(b). Arrows indicate wave numbers from the correlogram. 

V. Other Applications of the Correlogram 

It is worth while to mention briefly two more direct applications of serial correlations to 
textile research. All cotton yarns are irregular, the irregularity being the resultant of the drafting 
waves introduced at all the drafting processes, random variations in the distribution of the fibres 
along the yarn, and variations due to mechanical defects in the machinery. Since a chain breaks 
at its weakest link, the strength of a given length of yarn is the strength of the weakest place, 
which is nearly always the thinnest place. The measured strength of a yarn consequently decreases 
as the length of specimen broken increases, and its strength properties can be completely defined 
only by the complete relation between strength and specimen length. This relation has been 
calculated by Peirce¹¹ for a chain in which the links are assembled at random. Naturally the 
results do not agree with those observed for yarns, for in yarns neighbouring links are correlated. 
Obviously a knowledge of the correlogram for a yarn would allow the complete calculation to 
be made. 
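
The weakest-link effect for the random chain can be illustrated by a small simulation. This is a sketch only: it assumes independent links with normally distributed strengths (the mean and standard deviation are hypothetical), which is the random-assembly case and not the correlated case that applies to real yarns, and it is not Peirce's calculation.

```python
import random


def mean_specimen_strength(links_per_specimen, trials, rng, mean=100.0, sd=10.0):
    """Average strength of specimens whose strength is that of the weakest
    of `links_per_specimen` independent links."""
    total = 0.0
    for _ in range(trials):
        total += min(rng.gauss(mean, sd) for _ in range(links_per_specimen))
    return total / trials


rng = random.Random(0)
# Mean strength for specimens of 1, 10 and 100 links.
strengths = [mean_specimen_strength(n, 2000, rng) for n in (1, 10, 100)]
```

The mean strength falls as the specimen is lengthened. With correlated neighbouring links, as in a yarn, the fall would be slower, which is where the correlogram would enter the calculation.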

Another interesting question, to which, however, the application of the correlogram would at 
present be only of academic importance, is the formation of bars and other patterns when a 
periodic variation in the diameter of the yarn happens to be simply related to the width of the 
cloth. The liability to form such patterns is obviously related to the heights of the peaks and 
troughs of the correlogram. An example was the roving of the correlogram of Fig. 11(b). This 
formed a pronounced spiral pattern on the bobbin on which it was wound when the circum- 
ference of the bobbin was nearly an exact multiple of the period length, whereas such patterns 
were not formed by the rovings with more highly damped correlograms. 

VI. Conclusion 

It will be seen from this outline of the applications of periodograms and correlogram to some 
of the problems of textile research that these methods of analysis may serve two distinct purposes. 
They may, as in the two problems indicated in the last section, be used purely to describe or 
characterize the series for some other specific purpose, or they may be employed in an attempt 
to discover something of the physical causes, which produced the series, and to measure properties 
of the series such as periodic times, amplitudes, or damping coefficients, which may be related to 
the corresponding properties of the underlying mechanism. It is in this latter type of application 
that the greatest difficulties arise, and I should like to suggest that valuable progress might be 
made by a more physical approach to the problems involved. Up to now it seems that most of 
the work on time series has been done either on meteorological, economic or other series, for 
which little or nothing is known of the underlying mechanism, or on artificially generated series 
for which the process of generation is purely mathematical. While it is true that the simple 
auto-regressive scheme does correspond to a definite physical system— namely, one that executes 
damped simple harmonic motion — and that there seems little need for a more complicated scheme 
at present, there are several kinds of physical systems, in which the causes of oscillation are 
physically, and sometimes mathematically, very different. For example, there are control systems,¹ 
such as thermostats, which tend to oscillate about the temperature to which they have been set, 
and relaxation oscillations,¹³ in which the damping is negative for small displacements, and 
positive for large displacements, so that the system is unstable, but nevertheless oscillates with a 
definite amplitude. This type of oscillation is especially interesting, as it seems likely to respond 
to disturbances rather differently from the others. A damped harmonic motion, for example, is 
maintained by the disturbances; consequently its mean amplitude is determined by the magnitude 
and frequency of these, and the damping of its correlogram is that of the oscillating system itself ; 
on the other hand, the amplitude of a relaxation oscillation would probably be little affected by 
disturbances, which could therefore affect only the phase. There is some evidence that the 
drafting wave is an oscillation of this type, and that its variations in amplitude are the result of 
the averaging of the more or less independent waves in different longitudinal sections of the 
sliver. 

The effect of disturbances on these and possibly other types of system could be studied 
mathematically and also experimentally. Models of them, with any required characteristics, 
could be constructed, and would be a very convenient means for generating artificial series. Such 
a study would lead to a better understanding of the meaning of the results of correlogram or 
periodogram analysis, might, by allowing the characteristics of different types of oscillation to 
be recognized, enable these to be interpreted more precisely, and might also suggest alternative 
and more suitable methods of analysis. 

Finally, it is perhaps worth while calling attention to the drafting-wave time series as a 
practically unlimited supply of material for statistical experiments. Samples can be of almost 
any length, and within limits we can control the wave-length, the amplitude and the damping, 
and, if necessary, can add simple harmonic variations of any required amplitude and length. 





References. 

1 Callender, A., Hartree, D. R., and Porter, A. (1936), Phil. Trans. Roy. Soc., A, 235, 414-444. 
2 Foster, G. A. R. (1930), J. Text. Inst., 21, T18-28. 
3 Foster, G. A. R. (1936), ibid., 27, T37-52. 
4 Foster, G. A. R. (1945), ibid., 36, T229. 
5 Foster, G. A. R., and Martindale, J. G. (1946), ibid., 37, T1-12. 
6 Gray, T. S. (1931), J. Franklin Inst., 212, 77. 
7 Kendall, M. G. (1944), Biom., 33, 105. 
8 Kendall, M. G. (1945), J. Roy. Stat. Soc., 108, 93-141. 
9 Martindale, J. G. (1941), J. Text. Inst., 32, T71-82. 
10 Martindale, J. G. (1945), ibid., 36, T35. 
11 Peirce, F. T. (1926), ibid., 17, T355. 
12 Spencer-Smith, S. L., and Todd, H. A. C. (1941), Suppl. J. Roy. Stat. Soc., 7, 131-145. 
13 Van der Pol (1926), Phil. Mag., 2, 978; (1928), ibid., 6, 763. 
14 Yule, G. U. (1927), Phil. Trans. Roy. Soc., A, 226, 267-298. 



62 


Cunningham and Hynd: 


[No. I, 


Random Processes in Problems of Air Warfare. 

By L. B. C. Cunningham and W. R. B. Hynd. 

In the course of the last two years there has been a striking increase in the interest shown in the 
application of correlation theory to armament studies; previously, it had been appreciated that 
much was lacking in the various early mathematical descriptions of weapon performance, but only 
recently has the remedy been found to lie in the determination of covariances and in the subsequent 
performance of various operations on the resulting autocorrelation functions. It is now becoming 
more and more widely recognized that if some military technique requires the measurement of 
any parameter upon which some fluctuating error is superposed, then it will in general be fruitful 
to study the autocorrelation of this error. We will subsequently illustrate how this is borne out in 
the particular cases of the hunting error of a radar set, employed by anti-aircraft gunners to deter- 
mine the position of a target aircraft ; or of a pilot trying to fly on a beam at a constant speed, in 
ground-controlled radar blind bombing ; or of an air-gunner attempting, despite the wander of his 
aim, to keep his guns trained on a hostile aeroplane. In the computation of correlograms for these 
problems, we have been saved much tedious and time-consuming work by the employment of a 
special relay computer which will also be described later. 

In war-time it has rarely been necessary to consider very thoroughly the levels of significance to 
which the autocorrelations were determined : the special computer has made it possible to corre- 
late a large quantity (often all) of the available data of a particular type, and the result had to be 
regarded as possibly not the true autocorrelation function, but at least the best estimate which 
could be made in the circumstances, and from which conclusions had to be drawn. 


Anti-Aircraft Gunnery : Radar Tracking 


The well-known method for obtaining the spectrum of a random process by evaluating the 
Fourier cosine transform of its correlogram has been used in this country by Shire and Runcorn 
of the Radio Research and Development Establishment, and in America by Phillips, Weiss and 
others of the Massachusetts Institute of Technology, in studies of the radar control of anti- 
aircraft gunfire. The radar set involved, whether automatic or manually operated, will not give 
smooth information, but will have oscillatory errors superimposed upon the true values. In 
normal practice these errors are to some degree smoothed, but there is a limit to which this is 
permissible, else the lag so introduced will cause larger errors at the gun than those due to erroneous 
fluctuations in the information. It follows that while high-frequency errors may be almost elimin- 
ated, those with periods of the order of a few seconds remain. The predictor receives data which is 
accordingly somewhat unreliable, and its purpose is to make an estimate of the future position of 
the target aircraft, from which may be determined the necessary azimuth and elevation of the guns 
and fuze-setting of the shell. In this predictor stage the various harmonic components of the error 
are magnified by a factor which is a function of their frequency. This function differs markedly 
from one predictor to another, and it is thus possible to select an optimum when the error spectrum 
of the particular radar set is known. 
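
The cosine-transform step mentioned at the start of this section can be sketched for a correlogram tabulated at equal lags. The normalisation used below (1 plus twice the cosine sum, with frequency in cycles per sampling interval), the function name and the artificial damped-cosine correlogram are all assumptions made for illustration; they are not the procedure of any of the workers named.

```python
import math


def spectrum_from_correlogram(r, frequencies):
    """Cosine transform of a correlogram r[0], r[1], ... (r[0] = 1).

    frequencies are in cycles per sampling interval (0 to 0.5)."""
    return [
        r[0] + 2.0 * sum(r[k] * math.cos(2.0 * math.pi * f * k)
                         for k in range(1, len(r)))
        for f in frequencies
    ]


# Artificial correlogram: an oscillation of period 8 intervals, damped by 0.9 per lag.
correlogram = [0.9 ** k * math.cos(2.0 * math.pi * k / 8.0) for k in range(101)]
frequencies = [0.005 * i for i in range(101)]
spectrum = spectrum_from_correlogram(correlogram, frequencies)
peak_frequency = frequencies[spectrum.index(max(spectrum))]
```

The estimated spectrum is concentrated around the frequency of the oscillation in the correlogram, here one cycle in eight intervals; a more heavily damped correlogram gives a broader peak.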

In Fig. 1 is drawn the correlogram for errors in elevation of a typical anti-aircraft radar set, 
together with an estimate of the corresponding spectrum, while Fig. 2 shows the magnification of 
error graphed against frequency for a particular predictor. 

Instead of describing a predictor by its characteristic curve of error magnification, we may study 
its response to a “Heaviside impulse function.” The impulse function (also called the Dirac 
delta function) is defined by the equations 

$$ \delta(t) = 0, \quad t \neq 0; \qquad \int_{-\infty}^{\infty} \delta(t)\,dt = 1 \tag{1.1} $$

Let the predictor response be $g(t)$ to such an impulsive input. Then its response to any input, 
$f(t)$, will be given by 

$$ G(t) = \int_{-\infty}^{t} f(s)\,g(t - s)\,ds = \int_{0}^{\infty} f(t - s)\,g(s)\,ds \tag{1.2} $$

showing that the output is (proportional to) a weighted average of all previous values of the input, 
and that $g(t)$ serves as a weighting function. 

If we stop to consider the role of the predictor we may obtain a more complete understanding 
of the significance of this weighting function. For simplicity, we will consider only one dimension, 
and assume that the target is flying on a straight course at constant speed v ; then its actual position 
at time t will be vt, while the radar set will record its position as being 

$$ f(t) = vt + e(t) \tag{1.3} $$

From the values of this input function $f$, the predictor has to make two estimates: (i) an estimate, 
$f_1(t)$, of the true position of the aircraft, (ii) an estimate, $f_2'(t)$, of the true velocity of the aircraft, 
where a dash denotes differentiation with respect to time. 

From these estimates the predictor deduces that the aircraft will be at a position $[f_1(t) + T_f f_2'(t)]$ 
at a time $T_f$ later, where $T_f$ is the time of flight of the anti-aircraft shell; the predictor’s 
efficiency will be assessed by comparing this “output” with the true value $(vt + vT_f)$. 

Up to the present time, at least, predictors have been constructed from linear networks, and 
therefore $f_1$ and $f_2'$ must be weighted averages of preceding values of the position and velocity of the 
aircraft, so that 

$$ f_1(t) = \int_{-\infty}^{t} f(s)\,g_1(t - s)\,ds \tag{1.4} $$

and 

$$ f_2'(t) = \int_{-\infty}^{t} f'(s)\,g_2(t - s)\,ds \tag{1.5} $$

where $g_1(t)$ and $g_2(t)$ are the corresponding weighting functions which satisfy the conditions 

$$ \int_0^{\infty} g_1(t)\,dt = \int_0^{\infty} g_2(t)\,dt = 1 \tag{1.6} $$


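Now, it is a property of $\delta'(t)$, the first derivative of the Dirac delta function, that 

$$ \int_b^c f(u)\,\delta'(u - a)\,du = \begin{cases} -f'(a), & b < a < c \\ 0, & \text{otherwise} \end{cases} $$

so that 

$$ \int_{-\infty}^{t} g_2(t - s)\,\delta'(u - s)\,ds = \begin{cases} -g_2'(t - u), & u < t \\ 0, & u > t \end{cases} \tag{1.7} $$

and therefore * 

$$ f_2'(t) = -\int_{-\infty}^{t}\int_{-\infty}^{\infty} g_2(t - s)\,f(u)\,\delta'(u - s)\,du\,ds = \int_{-\infty}^{t} f(u)\,g_2'(t - u)\,du \tag{1.8} $$

Fig. 2. Magnification of error against frequency (cycles/sec.) for a particular predictor. 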

From the results (1.4) and (1.8) the predictor output is given by 

$$ G(t) = f_1(t) + T_f f_2'(t) = \int_{-\infty}^{t} f(s)\{g_1(t - s) + T_f\,g_2'(t - s)\}\,ds $$

and as this must be identical with (1.2), it follows that 

$$ g(s) = g_1(s) + T_f\,g_2'(s) \tag{1.9} $$

giving an analysis of the weighting function into two terms, the one concerned solely with smooth- 
ing, and the other with smoothing and differentiating, and the weighting function for velocity is 
obtained from the second term by the rule 

$$ g_2(t) = \int_0^{t} g_2'(s)\,ds \tag{1.10} $$

where $g_2'(t)$ is the coefficient of $T_f$ in $g(t)$. 

* The transformation from equation (1.5) to (1.8) might also be achieved by the careful application of 
the process of integration by parts, due cognizance being taken of the fact that $g_2(t)$ has a finite discontinuity 
at $t = 0$. 

Using equations (1.2) and (1.3), the error in the predictor output is given by 

$$ E = G(t) - (vt + vT_f) = \int_0^{\infty} [v(t - s) + e(t - s)]\,g(s)\,ds - vt - vT_f $$

$$ = vt\int_0^{\infty} g(s)\,ds - vt - v\int_0^{\infty} s\,g(s)\,ds - vT_f + \int_0^{\infty} e(t - s)\,g(s)\,ds \tag{1.11} $$

When there are no errors $e(t)$ in the radar data, the predictor output error must also be zero, and by
equation (1.11) this requires

$$\int_0^{\infty} g(s)\,ds = 1 \quad\text{and}\quad \int_0^{\infty} s\,g(s)\,ds = -T_f \tag{1.12}$$

The error in predictor output is thus given in general by

$$E = \int_0^{\infty} e(t - s)\,g(s)\,ds \tag{1.13}$$
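The conditions (1.12) can be checked on an error-free straight-line track. The sketch below is illustrative only: the two-point weighting function, the name `predict` and all numerical values are assumptions introduced here, chosen simply because the weights satisfy both conditions (1.12).

```python
# Hypothetical two-point weighting function, chosen only because it satisfies (1.12):
#   g(s) = (1 + k) * delta(s) - k * delta(s - T),  with k = T_f / T,
# so the weights sum to 1 and the first moment is -k * T = -T_f.

def predict(f, t, T, T_f):
    """Predictor output G(t) = integral of f(t - s) g(s) ds for the two-point g."""
    k = T_f / T
    return (1.0 + k) * f(t) - k * f(t - T)

v = 120.0                   # assumed constant speed of the target
track = lambda t: v * t     # error-free radar data, f(t) = v t
t, T, T_f = 30.0, 8.0, 5.0  # illustrative times only

G = predict(track, t, T, T_f)
E = G - (v * t + v * T_f)   # predictor error, as defined at the head of (1.11)
print(G, E)
```

With no radar error the output coincides with the future position $v(t + T_f)$, so the error vanishes, as (1.11) and (1.12) require.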


It is helpful to enunciate here a very general theorem, that if $O_1$ and $O_2$ denote any two linear
operators, the coefficients of which are functions of time, and if $X(t)$ be a random variable with
variance $\sigma_x^2$ and normalized autocorrelation function $\rho_x$, then

$$\operatorname{cov}[O_1 X(u);\ O_2 X(v)] = \sigma_x^2\,O_1 O_2\,\rho_x(u - v) \tag{1.14}$$

Together with (1.13), this theorem readily gives the variance of $E$ in the form

$$\sigma_E^2 = \sigma_e^2 \int_0^{\infty}\!\!\int_0^{\infty} \rho_e(u - v)\,g(u)\,g(v)\,du\,dv \tag{1.15}$$

where $\sigma_e^2$ and $\rho_e(t)$ are the variance and autocorrelation function for the error $e(t)$ in the radar data.

It is now possible, by standard variational procedure, to choose $g(t)$ of such a form as to minimize
$\sigma_E^2$ while always satisfying the restrictions of equations (1.12). The result is that

$$\int_0^{\infty} g(u)\,\rho_e(t - u)\,du = A + Bt \tag{1.16}$$

where $A$ and $B$ are constants to be determined by the equations (1.12).

There are practical reasons for introducing as a further restriction on $g(t)$ that its value should
be zero when $t$ is greater than a critical value $T$. This implies that the estimates of aircraft position
and velocity will be based only on radar data supplied to the predictor during the time-interval $T$
immediately preceding. Clearly $T$ must not be too large, for the aircraft presents itself as a target
to the gun only for a short period, and even within that time it may frequently vary its velocity or
heading, so that our assumption that it flies a straight course at constant speed will be valid for
only yet shorter periods. This restriction on $g(t)$ may be expressed in the mathematics by replacing
the upper limits of integration in (1.12) and (1.16) by $T$: the optimum weighting function is thus
the solution of the integral equation

$$\int_0^{T} g(u)\,\rho_e(t - u)\,du = A + Bt \tag{1.17}$$

where $A$ and $B$ are determined by the conditions

$$\int_0^{T} g(u)\,du = 1 \tag{1.18}$$

$$\int_0^{T} u\,g(u)\,du = -T_f \tag{1.19}$$


The solution of (1.17) will now be obtained in the particular case when the errors in the radar
data are of Markoff type, so that

$$\rho_e(t) = e^{-\lambda|t|} \tag{1.20}$$



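Clearly

$$\int_0^T \rho_e(t - u)\,du = \int_0^t e^{-\lambda(t - u)}\,du + \int_t^T e^{-\lambda(u - t)}\,du = \frac{1}{\lambda}\left\{2 - e^{-\lambda t} - e^{-\lambda(T - t)}\right\}$$

so that

$$1 = \frac{\lambda}{2}\int_0^T \rho_e(t - u)\,du + \tfrac12\rho_e(t) + \tfrac12\rho_e(t - T) = \tfrac12\int_0^T \rho_e(t - u)\,\{\lambda + \delta(u) + \delta(u - T)\}\,du$$

and

$$\int_0^T u\,\rho_e(t - u)\,du = \frac{2t}{\lambda} + \frac{1}{\lambda^2}e^{-\lambda t} - \left(\frac{T}{\lambda} + \frac{1}{\lambda^2}\right)e^{-\lambda(T - t)}$$

so that

$$t = \tfrac12\int_0^T \rho_e(t - u)\left[\lambda u + T\delta(u - T) - \frac{1}{\lambda}\delta(u) + \frac{1}{\lambda}\delta(u - T)\right]du$$

It follows that equation (1.17) will be satisfied by

$$g(u) = \frac{A}{2}\{\lambda + \delta(u) + \delta(u - T)\} + \frac{B}{2}\left\{\lambda u + T\delta(u - T) - \frac{1}{\lambda}\delta(u) + \frac{1}{\lambda}\delta(u - T)\right\} \tag{1.21}$$

Substituting equation (1.21) into the conditions (1.18) and (1.19) leads to

$$1 = \int_0^T g(u)\,du = \frac{A}{2}(\lambda T + 2) + \frac{B}{2}\left\{\tfrac12\lambda T^2 + T\right\}$$

and

$$-T_f = \int_0^T u\,g(u)\,du = \frac{A}{2}\left\{\tfrac12\lambda T^2 + T\right\} + \frac{B}{2}\left\{\tfrac13\lambda T^3 + T^2 + \frac{T}{\lambda}\right\}$$

which may be solved to give

$$A = \frac{8\lambda^2T^2 + 24\lambda T + 24 + 12\lambda T_f(\lambda T + 2)}{\lambda^3T^3 + 8\lambda^2T^2 + 24\lambda T + 24}$$

$$B = -\frac{12\lambda}{T}\cdot\frac{T + 2T_f}{\lambda^2T^2 + 6\lambda T + 12}$$

so that

$$g(u) = \frac{1}{\lambda^3T^3 + 8\lambda^2T^2 + 24\lambda T + 24}\left[\begin{aligned} &\{4\lambda^2T^2 + 12\lambda T + 12 + 6\lambda T_f(\lambda T + 2)\}\lambda - \{6(2\lambda T_f + \lambda T)(\lambda T + 2)\}\frac{\lambda u}{T} \\ &+ \left\{4\lambda^2T^2 + 18\lambda T + 24 + 6\frac{T_f}{T}(\lambda T + 2)^2\right\}\delta(u) \\ &- \left\{6\lambda T + 2\lambda^2T^2 + 6\frac{T_f}{T}(\lambda T + 2)^2\right\}\delta(u - T) \end{aligned}\right] \tag{1.22}$$

Integrating the coefficient of $T_f$ in this case shows that, according to equation (1.10), the velocity
weighting function in this case is of quadratic form:

$$g_2(u) = \begin{cases} \dfrac{6}{T}\cdot\dfrac{\lambda^2 u(T - u) + \lambda T + 2}{\lambda^2T^2 + 6\lambda T + 12} & \text{in the range } 0 \le u \le T \\[2ex] 0 & \text{for } u > T \end{cases} \tag{1.23}$$

For $\lambda T$ large

$$g(u) \to \frac{1}{T^3}\left[4T^2 - 6uT + 6T_f(T - 2u)\right]$$

and

$$g_2(u) \to \frac{6}{T^3}\left[Tu - u^2\right] : \quad 0 < u < T$$

forms which are known as “Bode smoothing functions,” useful when the error in the radar
data approximates to uncorrelated random noise.

At the other extreme, when $\lambda T$ is small,

$$g(u) \to \delta(u) + \frac{T_f}{T}\left[\delta(u) - \delta(u - T)\right]$$

$$g_2(u) \to \frac{1}{T} : \quad 0 < u < T$$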


SO that in this case, when the error is highly correlated over the smoothing interval, the velocities 
are smoothed by taking their unweighted arithmetical average. 
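The ideal weighting function for Markoff errors can be checked numerically. The following is a sketch under stated assumptions, not the authors' computation: it takes $g(u)$ to consist of a linear part on $(0, T)$ together with delta functions at $u = 0$ and $u = T$, as in (1.21), counts each delta function in full, and uses made-up values of $\lambda$, $T$ and $T_f$. The function names are hypothetical.

```python
import math

def ideal_constants(lam, T, T_f):
    """Constants A, B of (1.17) for Markoff errors rho(t) = exp(-lam |t|)."""
    a = lam * T
    A = (8*a*a + 24*a + 24 + 12*lam*T_f*(a + 2)) / (a**3 + 8*a*a + 24*a + 24)
    B = -12 * lam * (1 + 2*T_f/T) / (a*a + 6*a + 12)
    return A, B

def ideal_parts(lam, T, T_f):
    """Linear part c0 + c1*u on (0, T) and the delta weights at u = 0 and u = T."""
    A, B = ideal_constants(lam, T, T_f)
    c0, c1 = 0.5*A*lam, 0.5*B*lam
    w0 = 0.5*A - 0.5*B/lam
    wT = 0.5*A + 0.5*B*T + 0.5*B/lam
    return c0, c1, w0, wT

def lhs_117(t, lam, T, T_f, n=4000):
    """Left-hand side of (1.17): midpoint rule for the linear part, plus the delta terms."""
    c0, c1, w0, wT = ideal_parts(lam, T, T_f)
    h = T / n
    s = sum((c0 + c1*(i + 0.5)*h) * math.exp(-lam*abs(t - (i + 0.5)*h))
            for i in range(n)) * h
    return s + w0*math.exp(-lam*t) + wT*math.exp(-lam*(T - t))

lam, T, T_f = 1.0, 2.0, 1.0              # illustrative values only
A, B = ideal_constants(lam, T, T_f)
c0, c1, w0, wT = ideal_parts(lam, T, T_f)
total = c0*T + c1*T*T/2 + w0 + wT        # integral of g; (1.18) requires 1
moment = c0*T*T/2 + c1*T**3/3 + wT*T     # integral of u*g; (1.19) requires -T_f
residual = max(abs(lhs_117(t, lam, T, T_f) - (A + B*t))
               for t in (0.0, 0.5, 1.0, 1.5, 2.0))
print(total, moment, residual)
```

The residual measures how closely the integral equation (1.17) is satisfied at a few sample instants within the smoothing interval.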

Graphs of the function $g_2(u)$, as given by equation (1.23), are provided in Fig. 3 for various
values of $\lambda T$.

Substitution of the particular forms (1.20), (1.23) and (1.24) into equation (1.15) leads to the
following results for the variance of the error in the predictor output:





In the case of Bode smoothing 





20 72 _ 8 / 

3^2 r ^ *1- 


9 + 9 

jT -r JI2 





i+p3-y(l + 5-+rJ^'^)] 

. . (1.25) 

In the case of delta-function smoothing

$$\sigma_E^2 = \sigma_e^2\left[1 + 2\frac{T_f}{T}\left(1 + \frac{T_f}{T}\right)\left(1 - e^{-\lambda T}\right)\right] \tag{1.26}$$

[Fig. 4.]

The corresponding formula in the case of ideal smoothing is rather more complex, but may be 
obtained from equations (1.15), (1.20) and (1.21) in the form 

V = f 2^ ®)’‘+ + . . (1.27) 

where $A$ and $B$ are functions of $T$, $T_f$ and $\lambda$ previously specified. The expressions (1.25), (1.26)
and (1.27) are all illustrated, for a particular value of $\lambda T_f$, in Fig. 4 (logarithmic scales).
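The three smoothing schemes may be compared numerically by evaluating (1.15) with the Markoff autocorrelation (1.20). The sketch below is an illustration under assumptions: the parameter values are invented, the Bode variance is found by brute-force quadrature rather than from (1.25), and the ideal variance is taken as $A - BT_f$, which follows on substituting (1.17) to (1.19) into (1.15) but is a derivation made here rather than the form printed as (1.27).

```python
import math

def variance_ratio(g, T, lam, n=400):
    """sigma_E^2 / sigma_e^2 from (1.15), midpoint rule, for a continuous g on (0, T)."""
    h = T / n
    u = [(i + 0.5) * h for i in range(n)]
    gu = [g(x) for x in u]
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += math.exp(-lam * abs(u[i] - u[j])) * gu[i] * gu[j]
    return total * h * h

lam, T, T_f = 1.0, 2.0, 1.0   # illustrative values only
k = T_f / T

# Bode smoothing: the large lam*T limit of the ideal g(u).
bode = variance_ratio(lambda u: (4*T*T - 6*u*T + 6*T_f*(T - 2*u)) / T**3, T, lam)

# Delta-function smoothing: weights (1 + k) at u = 0 and -k at u = T, summed directly.
delta_sum = (1 + k)**2 + k**2 - 2*k*(1 + k)*math.exp(-lam*T)
delta_closed = 1 + 2*k*(1 + k)*(1 - math.exp(-lam*T))   # formula (1.26)

# Ideal smoothing: A - B*T_f (derived here from (1.15) and (1.17)-(1.19); an assumption).
a = lam * T
A = (8*a*a + 24*a + 24 + 12*lam*T_f*(a + 2)) / (a**3 + 8*a*a + 24*a + 24)
B = -12 * lam * (1 + 2*k) / (a*a + 6*a + 12)
ideal = A - B*T_f
print(bode, delta_closed, ideal)
```

Since the Bode and delta-function forms both satisfy the conditions (1.18) and (1.19), neither can give a smaller variance than the ideal form, and the printed values reflect this.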

To close this discussion on the feeding of radar data to various predictor circuits, we might 
make two comments. It is possible, and indeed almost necessary, to introduce some smoothing 
stage between the radar output and the predictor input. This operation may easily be introduced 




into the mathematics, for the output of these additional circuits will have a variance and autocorrelation
which may be expressed using equation (1.14) in terms of the corresponding statistics
for the unsmoothed original radar data, and of the response of the smoothing equipment to an 
impulse function. The above theory is then applicable to these new statistics, which describe the 
predictor input. 

No mention has as yet been made here as to the duration, T, which should be chosen for the 
smoothing interval ; indeed, in the case of formulae (1.25) and (1.26) and the corresponding Fig. 4, 
it is clear that the greater the value of T the smaller is the error in prediction, and this would in 
general be found true when the radar errors had other autocorrelation functions. For reasons 
mentioned earlier, this is in disagreement with physical arguments, the discrepancy being due to 
our failure to introduce into the mathematics some measure of the fact that with increasing time 
the aircraft’s position is more and more likely in practice to diverge from the value, $vt$, which has
been assumed. Studies of this latter effect might be based on recorded tracks of aircraft over gun- 
defended areas, the statistical analysis of which would again require the computation of something 
of the nature of an autocorrelation; this part of the problem is similar to a random walk in which 
the steps are of variable length and there is a restriction on the angle between successive steps. 



With this information analysed, it should be possible to improve upon the above analysis and to
determine an optimum smoothing interval, long enough for the radar error to be effectively averaged,
but short enough for only a small error to accrue from the freedom of the target aircraft to
change its speed and heading. Going farther in this direction, it would probably be possible completely
to eliminate the rather arbitrary restriction

$$g(u) = 0 \quad\text{for } u > T$$

and to determine overall optimum weighting functions which would define both the predictor 
circuits and the smoothing circuits which may be introduced between the radar equipment and the 
guns. 

Non-Visual Bombing: Beam Flying 

A somewhat similar problem of radar ranging, smoothing and differentiating arises in ground-controlled
“blind” bombing, and has been studied by us on behalf of the “Oboe” system used
by the Pathfinder Force of Bomber Command. The broad details of such a blind bombing 
technique are illustrated in Fig. 5. 

To assist the bomber pilot in his attempt to fly on a certain straight path which will take him 
over the target, he is provided with continuous radio information as to his distance from this in- 
tended course; a highly-accurate radar equipment meanwhile determines the range of the aircraft 




from a ground station in friendly territory, and estimates the aircraft’s forward velocity from the 
rate of change of this range. Knowing the time which the bombs will take to drop to target-level, 
it is then theoretically possible to predict the range at which they should be released, and when the 
aircraft reaches this range it receives instructions to this effect from the ground station. 

One feature of distinction between the predictor problem and the present case is that the be- 
haviour of the aircraft is much more restricted ; its statistical description is therefore comparatively 
easy to determine, and has been fully included in the mathematics which follows. It is also im- 
portant to note that the “ jitter ” inevitable in the radar ranges is constituted from much higher 
frequency components than is any of the other variables involved, so that radar errors may be 
completely filtered out and eliminated. There remain two sources of fluctuation which are of 
importance — fluctuations of the aircraft to one side and the other of the intended track, and 
fluctuations in the aircraft’s forward velocity (such as might arise from small changes in engine 
power). 

We will assume that the ground station is at a great distance from the target, so that the angle, $\alpha$,
between the intended course and the line joining aircraft to station, may be regarded as constant
throughout the bombing run.

Let $x(t)$, $y(t)$ and $r(t)$, variables all dependent on the time $t$, measure the components of the
distance between target and aircraft in the directions along the intended track, perpendicular to this
track, and towards the ground station. We have

$$r = x\cos\alpha - y\sin\alpha \quad\text{or}\quad x = r\sec\alpha + y\tan\alpha \tag{2.1}$$

The ground station has no information on the value of $y(t)$ at any time, and so must attribute
all changes in $r$ to changes in $x$: it therefore makes an estimate of the aircraft’s instantaneous
forward velocity in accordance with the equation

$$\sec\alpha\cdot\dot r = \dot x - \dot y\tan\alpha;$$

then, realizing that this value is subject to error, the ground station may modify it by taking a
weighted average value, so as to smooth out to some degree the undesirable $\dot y$ term, at the cost of
losing instantaneity in the $\dot x$ term. Accordingly, the aircraft velocity is assessed by the ground
station as being

$$V(t) = \int_{-\infty}^{t} \{\dot x(u) - \dot y(u)\tan\alpha\}\,g(t - u)\,du \tag{2.2}$$

where $g(t)$ is the weighting function introduced in the smoothing process, and satisfies

$$\int_0^{\infty} g(t)\,dt = 1 \tag{2.3}$$

A release signal will be sent to the bomber when its range from the target is $r(t_r)$, at time $t_r$
satisfying

$$\sec\alpha\cdot r(t_r) = -T_f V(t_r) = -T_f\int_{-\infty}^{t_r} \{\dot x(u) - \dot y(u)\tan\alpha\}\,g(t_r - u)\,du \tag{2.4}$$

where $T_f$ is the time of fall of the bomb.

The point of impact of the bomb on the ground will then be $(X, Y)$, where

$$X = x(t_r) + T_f\,\dot x(t_r) \tag{2.5}$$

$$Y = y(t_r) + T_f\,\dot y(t_r) \tag{2.6}$$

Put

$$\dot x(t) = v + e(t) \tag{2.7}$$

so that $e(t)$ is a random variable with zero mean.

Then using equations (2.5), (2.7), (2.1) and (2.4) in successive steps:

$$X = x(t_r) + vT_f + T_f\,e(t_r)$$

$$= r(t_r)\sec\alpha + y(t_r)\tan\alpha + vT_f + T_f\,e(t_r)$$

$$= T_f\left[e(t_r) - \int_{-\infty}^{t_r} e(u)\,g(t_r - u)\,du\right] + \tan\alpha\left[y(t_r) + T_f\int_{-\infty}^{t_r} \dot y(u)\,g(t_r - u)\,du\right] \tag{2.8}$$





As the angle between the intended and achieved aircraft tracks is usually of the order of one
degree or less, it is reasonable to assume that $e$ and $y$ are uncorrelated. If $\sigma_e^2$, $\sigma_y^2$ and $\rho_e(t)$, $\rho_y(t)$
are the variances and autocorrelation functions of these variables, the application of the general
result (1.14) to equations (2.8) and (2.6) leads to the following expressions for the variances of the
components of the bombing error:

$$\sigma_X^2 = T_f^2\sigma_e^2\left[1 - 2\int_0^{\infty} \rho_e(u)\,g(u)\,du + \int_0^{\infty}\!\!\int_0^{\infty} \rho_e(u - v)\,g(u)\,g(v)\,du\,dv\right]$$

$$\qquad + \tan^2\alpha\,\sigma_y^2\left[1 - 2T_f\int_0^{\infty} \rho_y'(u)\,g(u)\,du - T_f^2\int_0^{\infty}\!\!\int_0^{\infty} \rho_y''(u - v)\,g(u)\,g(v)\,du\,dv\right] \tag{2.9}$$

$$\sigma_Y^2 = \sigma_y^2\left[1 - T_f^2\,\rho_y''(0)\right] \tag{2.10}$$

This shows that $\sigma_Y$, the R.M.S. “line” bombing error, is independent of the smoothing function,
and only the R.M.S. range bombing error, $\sigma_X$, need be studied.

Experimental data (which was obtained on operational sorties by Pathfinder aircraft) led to the
estimates

$$\sigma_e = 4.74 \text{ m.p.h.}, \qquad \sigma_y = 30 \text{ yards} \tag{2.11}$$

Values of $\rho_e(t)$ were computed from the same data, and are graphed in Fig. 6, together with a fitted
curve of the form

$$\rho_e(t) = e^{-\lambda|t|}\cos\mu t \tag{2.12}$$

which was used in the evaluation of the integrals in (2.9), and likewise values of $\rho_y(t)$ are graphed in
Fig. 7, together with a fitted curve of the form

— (1 — flc“*'**)e"^** cos [Lt + (2.13) 

These experimental results were passed into the formula (2.9), which was then used to compare 
the efficiency of various simple forms for g{u) such as 

Case (i) g{u) — for various values of K 


Case (ii) $g(u) = 1/T$ in the range $\tau < u < \tau + T$, and $0$ otherwise,
for various values of $\tau$ and $T$.
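The velocity-error bracket of (2.9) is easily evaluated for a weighting function of the Case (ii) type. The sketch below is an illustration only: the decay and frequency parameters `lam` and `mu` of the fitted form (2.12) are placeholders rather than the fitted values, the function names are hypothetical, and the lateral term in $\rho_y$ is omitted.

```python
import math

def rho_e(t, lam, mu):
    """Fitted form (2.12); lam and mu are placeholders, not the paper's fitted values."""
    return math.exp(-lam * abs(t)) * math.cos(mu * t)

def velocity_term(tau, T, lam, mu, n=300):
    """First bracket of (2.9) for Case (ii): g(u) = 1/T on (tau, tau + T), else 0."""
    h = T / n
    u = [tau + (i + 0.5) * h for i in range(n)]
    single = sum(rho_e(x, lam, mu) for x in u) * h / T
    double = sum(rho_e(x - y, lam, mu) for x in u for y in u) * h * h / (T * T)
    return 1.0 - 2.0 * single + double

lam, mu = 0.5, 0.2   # hypothetical parameters, for illustration only
no_smoothing = velocity_term(0.0, 1e-3, lam, mu)   # window shrunk to the release instant
stale_data = velocity_term(40.0, 1e-3, lam, mu)    # short window after a long quiescent interval
print(no_smoothing, stale_data)
```

The two limiting calls bracket the behaviour of this term: with no smoothing and no delay it vanishes, while a velocity estimate formed long before release is uncorrelated with the velocity at release and the bracket approaches 2.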





A typical group of results is provided in Fig. 8, in which the range bombing error, $\sigma_X$, is graphed
against the angle, $\alpha$, for a variety of weighting functions $g(u)$, the time of bomb-fall being fixed at a
value corresponding to a release altitude of 30,000 feet.

It is clearly illustrated that for small values of $\alpha$ the minimum practical degree of smoothing
should be adopted, but when $\alpha$ increases above a critical value of about 30°, the optimum changes
abruptly, and the more extensive the smoothing the better will be the resulting bombing: the
variation of the bombing error with the angle $\alpha$ is marked when little smoothing is employed, but
is relatively unimportant when long-term averages are used. These comments are valid whatever
may be the precise form of the weighting function introduced.

Examination of Case (ii) curves shows that if the smoothing period, $T$, is kept constant and the
time interval, $\tau$, be varied between the end of this period and the moment of bomb release, there is
no great change in the bombing error: indeed, the introduction of the quiescent interval $\tau$ may
improve performance; the curves for $T = 20$ seconds, $\tau$ zero and for $T = 20$ seconds, $\tau = 20$
seconds are very similar and the latter is associated with the lower bombing error when $\alpha$ is greater
than 60°: even for smaller values of $\alpha$ there is little harm in introducing such an interval if it should
be advantageous for other reasons.





It seems appropriate to introduce at this stage some results on continuous random processes 
which were originally obtained in connection with beam flying, but are of much wider and more 
general interest. 

If the pilot exercises the greatest possible skill, and if his actions are quite free from error, he
will use only the ailerons of his aircraft so as to control the roll acceleration: it then follows from
aerodynamical reasoning that, in this perfect case, the motion of the aircraft will be described by a
fourth order differential equation

$$y^{\mathrm{iv}} + a y''' + b y'' + c y' + d y = 0 \tag{2.14}$$

In practice, no pilot, however skilful, will achieve this: there will always be random errors in
his efforts in controlling the motion, so that (2.14) must be replaced by

$$y^{\mathrm{iv}} + a y''' + b y'' + c y' + d y = Y(t) \tag{2.15}$$

where $Y(t)$ is a random variable, with zero mean, and its successive values will in general be correlated.
Equation (2.15) may be written

$$[D^4 + aD^3 + bD^2 + cD + d]\,y(t) = Y(t)$$

where $D$ represents the differential operator $d/dt$.

Many pilots will either not possess the requisite skill, or will consider it unnecessarily refined, to
adjust their distance off track by controlling their ailerons; instead, they will simply apply small
rudder movements, and the aircraft’s motion will then inevitably suffer from some degree of “side-slip.”
Under these conditions the corresponding differential equation is approximately of the
second order only:

$$b y'' + c y' + d y = Y(t) \tag{2.16}$$

or

$$[bD^2 + cD + d]\,y(t) = Y(t) \tag{2.17}$$


Further, the pilot’s reactions may not be instantaneous: the control which he applies at time $t$
may depend on information as to his distance off track which he has received at time $(t - v)$; the
damping term may also contain such a lag. Equation (2.16) must then be replaced by

$$b y''(t) + c y'(t - u) + d y(t - v) = Y(t)$$

and the equation corresponding to (2.17) is

$$[bD^2 + cDe^{-uD} + de^{-vD}]\,y(t) = Y(t) \tag{2.18}$$

Weighted averages may again be introduced: if the acceleration at time $t$ depends on an average
of all the information received earlier, then $de^{-vD}$ in (2.18) should be replaced by

$$d\int_0^{\infty} e^{-vD}\,g(v)\,dv$$

where $g(v)$ is, as usual, the weighting function.

All these cases will be covered if we can examine the quite general formula

$$F(D)\,y(t) = Y(t) \tag{2.19}$$

where $Y(t)$ has variance $\sigma_Y^2$, autocorrelation function $\rho_Y(t)$, and spectrum function given by

$$S_Y(\omega) = \frac{2}{\pi}\int_0^{\infty} \rho_Y(t)\cos\omega t\,dt \tag{2.20}$$

and if we can obtain the corresponding statistics for $y(t)$.
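The spectrum function (2.20) can be evaluated numerically for a given autocorrelation. The sketch below is illustrative, with assumptions labelled: it uses a Markoff autocorrelation with an invented decay rate, truncates the integral at a finite time, and takes made-up coefficients for the second-order operator of (2.17). The factor $1/\{F(i\omega)F(-i\omega)\}$ is the usual gain by which a linear system of this kind scales the input spectrum; it is quoted here as a standard result rather than taken from the text.

```python
import math

def spectrum(rho, omega, t_max=60.0, n=60000):
    """S(omega) = (2/pi) * integral_0^inf rho(t) cos(omega t) dt, truncated at t_max."""
    h = t_max / n
    s = sum(rho((i + 0.5) * h) * math.cos(omega * (i + 0.5) * h) for i in range(n))
    return 2.0 / math.pi * s * h

lam = 0.8                                   # hypothetical Markoff decay rate
rho_Y = lambda t: math.exp(-lam * abs(t))   # Markoff autocorrelation, as in (1.20)
omega = 1.5

S_numeric = spectrum(rho_Y, omega)
S_closed = 2.0 * lam / (math.pi * (lam**2 + omega**2))

# Second-order operator F(D) = b D^2 + c D + d of (2.17), with made-up coefficients.
b, c, d = 1.0, 0.6, 2.0
F = lambda z: b * z * z + c * z + d
gain = 1.0 / (F(1j * omega) * F(-1j * omega)).real   # 1 / (F(i w) F(-i w)), real for real b, c, d
print(S_numeric, S_closed, gain)
```

For real coefficients the product $F(i\omega)F(-i\omega)$ reduces to $(d - b\omega^2)^2 + c^2\omega^2$, which is the kind of denominator appearing in the spectra of the responses.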
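Now if $\Delta_t$ denote $\partial/\partial t$ and $\Delta_s$ denote $\partial/\partial s$, then

$$\frac{1}{F(\Delta_t)F(\Delta_s)}\cos(t - s)\omega = \frac{\cos(t - s)\omega}{F(i\omega)F(-i\omega)} \tag{2.21}$$

The residual steady state solution of (2.19) (the complementary function having died out) is

$$y(t) = \frac{1}{F(D)}\,Y(t)$$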



Referring once again to the result (1.14), it follows that



where $\epsilon(t)$ is an uncorrelated random variable: $S_\epsilon(\omega)$ may be represented in this case by the formula

$$S_\epsilon(\omega) = \frac{2\lambda}{\pi(\lambda^2 + \omega^2)}$$

the variable $\epsilon$ being thus regarded as the limiting form of a variable of Markoff type, with an
autocorrelation function $e^{-\lambda|t|}$. Define also


Then 

and 

so that 

and also 


* / " I €(u)ciu 

I // = I (l - l+b'") large X 

a/ _ r 2>^<o 1 



5,(co) 


I 2a 3 

V + CO*) t(p — CO®)* + a*co»] 7 t[(P — co*)® 4- 


a*co®J • 


(2.25) 

(2.26) 


Formulae (2.25) and (2.26) are in agreement with the formulae given by Dr. Bartlett.

Correlation theory might be usefully applied to other bombing problems: for instance, in
formation bombing, each aircraft tries to keep station with respect to one preceding aircraft which
has been designated as his “senior.” A full understanding of the distribution of the aircraft in
space, and of their velocities, will therefore require determination of the correlation between the
fluctuations of one aircraft and those of another. Unfortunately, no practical data have yet
become available which might be analysed in this manner. The effect of such correlations, and of
the consequent correlations in bombing error for different aircraft, would be assessed by their
influence on the chance that a target will receive a direct hit by at least one bomb, and the necessary
mathematical formulae will resemble those which will be presented in our next problem, the effect
of the correlation between successive rounds on the efficiency of aerial gunnery.





Performance of Guns with a High Rate of Fire 

When our attention is turned to studies of rapidly-firing guns, a very different correlation 
problem is presented. As a gunner tries to keep his gun trained on a target in unsteady condi- 
tions, the point of aim performs an erratic, quasi-orbital motion which is generally referred to as 
“ aim wander,” and is plainly a typical random process in two dimensions. Its practical effect is 
that of a discrete time series, picked out by the discharge, at approximately equal intervals of time, 
of successive rounds from the gun. The kinetic reactions to successive shots might influence the 
sequence, but this is usually found negligible, so that cine-gun analysis is valid. 

Each round, once fired, does not of course proceed exactly along the direction of aim, but will
diverge from it, as a consequence of such factors as ballistic asymmetry. This divergence (which
has been termed “gun dispersion,” as opposed to the aim-wander or dispersion of aim) will vary
in an uncorrelated manner from one round to the next, and the population of these divergences is
represented with high fidelity by a Gaussian distribution about the point of aim.

The importance of the correlation in aim-wander must be assessed by its influence on the 
survival chance of a target to an arbitrary burst of gunfire, compared with its chance when there 
is no correlation. In the simplest case,* this is the chance that every round in the burst considered 
should miss the target, having regard both to the aiming and gun dispersions. In other problems 
in this paper we have been interested in previous values of a random variable only in so far as they 
had an influence on the value at some later time; in the survival chance, previous values of the 
aiming error not only influence future values, but each also defines a momentary risk run by the 
target, and makes its own direct contribution to the final survival chance. No analysis of problems
of such a character seems to exist in the mathematical literature. The greater the correlation, the 
greater will be the coherence between the bullets, and the smaller will be the equivalent number of 
independent uncorrelated rounds : aim-wander correlation thus reduces the lethality of the gun- 
fire. That this reduction may be drastic is illustrated by the following example. 

Aiming statistics obtained by analysis of cine-gun films exposed in R.A.F. gunnery trials were 
interpreted in two different ways : firstly, paying due regard to the sequence of points of aim achieved 
in each sample attack ; secondly, considering only the overall distribution of aiming errors, and 
thus ignoring the correlation. Each method was used, in turn, to compute, from the same aiming 
data, a fighter pilot's chance of success in a typical attack with a 2.J-second burst of fire from four 
cannon, the mean target range during the sample population of bursts considered being 100 yards,
and the pilot using a fixed reflector sight. By the former (correct) method this chance was found 
to be 29 per cent., but the latter method led to a gravely erroneous value of 97 per cent. 

Discrepancies of this order between theory and practice cannot well be tolerated, and it is plain 
that a valid formula for the survival chance should be constructed. Unfortunately, the problem 
seems to be exceedingly intractable, and no closed formula has been discovered which is suitable 
for computation. The true value may nevertheless be set between limits: a lower limit is readily 
calculable by ignoring correlation between successive rounds, while an upper limit, obtainable 
rather less readily but still with no great difficulty, is offered by assuming perfect correlation, all 
rounds being discharged with the same point of aim, so that the rounds of a burst may be compared 
to the pellets scattered from a single cartridge fired in a shot-gun. This upper limit will accordingly 
be referred to as the “shot-gun formula.” Experimental values have shown that where the simple
theory of uncorrelated rounds is most erroneous, the shot-gun formula gave good agreement. We
have therefore expressed the survival chance as a Maclaurin series, the leading term of which is
actually the shot-gun formula, and as subsequent terms generally alternate in sign, partial sums
give successive lower and upper limits.
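The two limits on the survival chance may be illustrated with a simple numerical model. The sketch below rests on assumptions made here and is not the paper's computation: the normalized radial aiming error $w$ is given the density $2we^{-w^2}$ appropriate to symmetrical aim-wander, and the single-round miss chance `Q` is a hypothetical function with invented parameters `p0` and `kappa`.

```python
import math

def expect(fn, n=20000, w_max=8.0):
    """E[fn(w)] for radial aiming error w with density 2 w exp(-w^2) (midpoint rule)."""
    h = w_max / n
    return sum(fn((i + 0.5) * h) * 2 * (i + 0.5) * h * math.exp(-((i + 0.5) * h) ** 2)
               for i in range(n)) * h

# Hypothetical single-round miss chance: a hit is most likely when the aim is on target.
p0, kappa = 0.3, 4.0
Q = lambda w: 1.0 - p0 * math.exp(-kappa * w * w)

M = 20                                 # rounds in the burst
lower = expect(Q) ** M                 # rounds treated as uncorrelated
upper = expect(lambda w: Q(w) ** M)    # perfect correlation: the "shot-gun" case
print(lower, upper)
```

That the shot-gun value can never fall below the uncorrelated value is an instance of Jensen's inequality applied to the convex function $Q^M$; the true survival chance for partially correlated aim-wander lies between the two.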

For those readers who are interested in the results obtained rather than in the details of the 
methods by which they are derived, the mathematics which follows may be briefly summarized. 
After some preliminary remarks on the statistical functions which describe aim-wander, the desired 
survival chance is immediately written in a form (3.10) which is equivalent to a multiple integral: 
when the aiming errors are statistically symmetrical, this is exactly converted, by a Fourier trans- 
formation and inversion, into the single integral (3.16), in which the integrand contains certain 
operators which must be interpreted : this in turn is achieved by a series expansion, the resulting 

* Some targets are more complex: a two-engined aircraft, for instance, may require a hit in each of
its two engines before it will crash.





terms being simplified by an approximation which is valid for small targets (the replacement of 
(3.22) by (3.23)), giving finally the expression (3.25) for the survival chance. 

We will consider initially the case when a burst of $M$ rounds is fired, the time-intervals between
successive rounds being all equal to the value $\tau$. Let the error in aim for the $r$th round have angular
components $(x_r, y_r)$ with respect to axes which are orthogonal to one another and to the direction
of attack. For such bursts of fire, directed by a gunner of a certain degree of skill under a particular
set of conditions, each of these components is found in practice to belong to a normal
population, so that their distribution functions are

$$P(x_r) = \frac{1}{\sigma_{xr}\sqrt{2\pi}}\exp\left[-\frac{x_r^2}{2\sigma_{xr}^2}\right] \tag{3.1}$$

$$P(y_r) = \frac{1}{\sigma_{yr}\sqrt{2\pi}}\exp\left[-\frac{y_r^2}{2\sigma_{yr}^2}\right] \tag{3.2}$$


Note that the conditions under which the rth round is fired may differ from the conditions 
obtaining at other times during the burst, so that the suffix r must be added to the standard devia- 
tions. 

Further work will be expressed in terms of normalized aiming error components $u_r$, $v_r$, where

$$u_r = \frac{x_r}{\sqrt{2}\,\sigma_{xr}}, \qquad v_r = \frac{y_r}{\sqrt{2}\,\sigma_{yr}} \tag{3.3}$$


Presuming that aim-wander is a stationary process, it then follows that these normalized com- 
ponents will satisfy multivariate normal laws, so that 


...</««= I ^'2^ exp 1^- S dui. . . duj, 


(3.4) 


defines the chance that the nominated sequence of normalized wander components will be observed 
within differential tolerance limits, at the nominated sequence of instants. 

In (3.4), $a_{mr} = a_{rm}$ and the quadratic form is positive definite.

Defining 

/4-r i^-| X [co-factor of a„, in |a^|] 

it is well known that measures the correlation between and w., so that we may write 

= Pud'W- ''K) (3.5) 

Another standard result is that 

. . . «,)] = I [C(ai, . . . o ^ m )] - I [<?(«i • • «.,)] • • (3.6) 

where denotes the expectation of an arbitrary function, F, of the w., which are distributed in 
accordance with (3.4); denotes the expectation of a function of other variables, a^, which are 
distributed in accordance with ^ 

P(ai, . . . ajf) exp S ; 

and the function G is the Fourier transform of the function F, so that 


G(a 


». . . . ^m) '== . . . Ue 


2iiXcUr) 


The above remarks have been confined to the w components of the aiming errors, but the v are 
also distributed in accordance with a corresponding multivariate normal law : — 

P(vi, . . . Vj,)rfvi . . . = -1^- exp [- dvi. ..dvu . ■ . (3.7) 

and in this case 

Pm. == [co-factor of in I Z>„,|j= P.d m-r|T) . . . (3.8) 

Combining (3.4) and (3.7) the m’s and v's being uncorrelated, it is possible to write 
P(hi. ...«*: v„ . . . vj,) ^ exp [- - S 6^v„v,] 




and from this it follows, just as (3.6) followed from (3.4), that 

Vi, . . . V*)] - |, 4„,|-*|B.„|-l£'^,[G(ai, ... a,; p„ . . . p«)] 

1 r 1 /** 1 f” f —IX-liwam**/- + . 

= ^/ai . . . dctja I f/Pi . . . . . . a v ; 3i, . . . Pjf)e (3.9) 


where C?(ai, . . . ol„; Pi, . . . Pj^) is the Fourier transform of the arbitrary function F(wi, . , , UmI 
Vi» . . . Vjj,) and the significance of the notation is obvious. 

Let the $r$th round, fired when the normalized components of aiming error are $u_r$ and $v_r$, have a
chance $Q_r(u_r, v_r)$ of missing its target. Then the chance that the target will survive the burst of $M$
rounds is given by


Applying (3.9), 


^31 n Q,(w„ V,) 

# — 1 




irfit, n«»r+ ^mrPmPr) 


. . . (3.11) 


G.(«.. W = d^. ' "‘’'•^2.(5..^).) 

,/ — Z J — Ij 

L being a very large finite limit, such that the chance of erring by so much with either component is 
negligible. 

Inverting this transformation leads to 

n QM, K) - n / da, f dp.c.(a., + 

• *» 1 # - 1 y « y — go -J 

so that if/ be an arbitrary function satisfying certain broad and obvious conditions, 

/[- is (d - d„)55,- K. - B ,, „?-}] fi o.(»„ «.) 

If 1 / r* 

+ |(1 - -I- (1 - 5„,)p„p,|]] (3.12) 

As a particular case of (3.12), the inverse Fourier transform of 

i: (a - A^)a^ar + (1 “ ffmr)PmPr\ M 

^ II G,(a„ p,) 

tBT, I 

is 

i J fi V.) (3.13) 

* “• 1 

When the u and v components are each completely correlated, so that A,„r -= == 1, let Eq denote 

the expectation either of a function of the «’s and v’s or of a function of the a’s and p’s. Then (3.9) 
becomes in this special case 

Eq. E(wi, . . . Uu\ Vi, . . . Vn) — Eq. C»(ai, . . . a^,; Pi, . . . Pjf) 
so that, using (3.13) 

^”[®xp|~ 4m^[(‘ ~ 'Ou„hdk 

= £o[exp[s{(l - /l,Ja„a, T (1 - i?,.,)p„fi,}] H C.(a., p.)] 

= n j'“ da.|“ dP.G.(a.,p.)]exp[^{(l - /l„,)a,.a, -|- (1 - B„,)p„p,} ~ + p,.p,)] 

Oj, in accordance with (3.11) (3.14) 

Also 


Eo . E(wi, . B .Um\ Vi, . . . Vjf) = - j </wj 


dvF{Uy u, . B . u; V, V, . . . v)^ 




The ultimate formal expression for the survival chance is therefore 

in the integrand of which all the $u_r$ are to be equated to $u$, and all the $v_r$ to $v$, after the differential
operators in the exponent have been fully interpreted, and before the integrations with respect to $u$
and $v$ are carried out.

If the aim-wander be statistically symmetrical, then

$$\rho_u(t) = \rho_v(t) = \rho(t)$$

and $\sigma_{xr}$, $\sigma_{yr}$ are equal to a common value $\sigma_r$, so that $\sqrt{2}\,\sigma_r$ is the standard deviation of the radial error
in aim under the conditions associated with the $r$th round.

The polar co-ordinates of the point of aim will then be y,) where 

w, == 1- v/)i =- the normalized radial aiming error 

y, tan"^ 

If, 

The expression (3.15) may then be somewhat simplified, for G,(//„ v,) will become a function of the 
single variable (X.w,) where 


and (Sg, is the standard deviation of the gun dispersion. 

The operator 

[(1 - AJ su„bu, 

then becomes 

and is to operate, according to (3.14), under the conditions implied by the operator so that 

y, -= y„. and w, == w„, 

giving further simplification to 

[l - P(|«, ~ r|T)](^) 

Expression (3.15) therefore takes the form 

[i- Klm -|T)lCf) (/) « 



- ^ 2 | 

— 2 / wdwe 
•'0 

1 "•’’1 

1 '°° - tP* 

1 V 

— 2 wdwe 

4 »jrr 

J{) 





QM . . . . (3.16) 


where /)„, for instance, operates only on turning it into Q„', without regard to the argument of 
the function. 

We have obtained no more satisfactory method of interpreting the operators in the exponent of
(3.16) than by a Maclaurin series expansion. Writing

Smr = UmMl “ P( 1 W — r|-c)} 

and noting that p(0, and therefore g^, are even functions’ of time, then 


where 


gm, = i'r“^m,(0) f At*^„(0) + 0(t*) 

g..(0) = - ix„\(r - m)>p(0) 

'L(0) = - - rn/m 



Hence


exp [ - S g^D„D7\ - 1 - S - -h 5:!L(0)D„/), 

L. mr J mr mr 

+ |T‘(Sg,„(0)D„D,)“ + O(T»). . (3.17) 

Let ^,(X w) = log G.(X,h>), and denote D.Qog Q.) by q,'(\w), D.^log Q.) by ^."(X.m') ; then 

-^w(0)O„Z),. II o,(x,w) = n G.(\*‘’)^"y^.r(0)^„'(x„^v)^/(x,^*')1 . . (3.18) 

mr • =» 1 « 1 Lmr J 

iv jf r iv -1 

s g.„(0)D„D, n (?.(X.H-) - II e.(X.a-) S g„X0)^,/(X,„H-)<//(\H-) . . (3.19) 

'nr $ cz I M =zl L»ir J 

rsg,„,( 0 )O„D,]‘ n (2.(\H')= n (2.{x.M-)r2Sg„xo)<7,»"(x,„uO<7,"(x,M') 

Lmr Jf = l « = 1 Lmr 

F 4 2 g„„(0)g„MgJX\Mg./M4/M i- 1 ^ g..r(0)a,/Mg/aMyi ■ ■ (3.20) 

m«r I mr J J 

Substituting (3.17), (3.18), (3.19) and (3.20) into (3.16), and replacing the summations by integrations,
and assuming that $\lambda_r$, $\sigma_r$ and $\sigma_{gr}$ are all constant in time and equal to the values $\lambda$, $\sigma$ and
$\sigma_g$, then the survival chance against a burst of fire of duration $T$ is obtained in the form


f°° — u * + vf q(Xu<,t)cU p 

0(7") = 2 I wdwe ° 

• () L 


■'0 
q, y. 


1 + gV^XSj 


*p(0)f f (ti - ti)WO-w,t,)q\\w,tddt,dti 

'{) •'0 

+ if I — hy<l"0'W,ti)q"{Xw,t^tidt.i 

I- — v»X<p'^(0) ( [ j Oi — tifUi — /,y^''(\w,fi)f(Xw,t.j)e/'(^w,t3)d/idtidt3 
! iyg''‘^V(0)j^|^ I (t, - ■ /2yf<^**'>fi)g'i>^*<’di)d/td/3j’‘ 

ly fT f7' 

v-X» p (0) (t, - t.^q\Xw,t 3 )q\Xw,h)dt,dt 3 j (3.21) 


'■96 

where v is the rate of fire, so that 


M = yfT and 


1 


V — - 

T 


It should be remarked that any small discrepancy which may exist between the results of inte- 
gration and summation serves the interests of realism by making allowance for irregularity in the 
time intervals between successive rounds. 

If the target be circular, of angular radius R, then 


e(X>v,r) = 2 e (3.22) 

Jm 


where /q is the Bessel function of imaginary argument of order zero, but a highly accurate approxi- 

R * 

mation to (3.22) in the range - - : 1 is provided by 

Q(Xtv,t) = exp ^3 e •'•“'■J = exp [- (3.23) 

where 


r = Xvv and 


Q 


n being the solid angle subtended by the target at the gun. This angle, and therefore z, will vary 
with the firing range and therefore with time. We shall write 




Adopting (3.23), it follows that

$$q(\lambda w, t) = -z(t)\,e^{-r^2}$$

$$q'(\lambda w, t) = 2z(t)\,r\,e^{-r^2}$$

and

$$q''(\lambda w, t) = 2z(t)\,(1 - 2r^2)\,e^{-r^2}$$

so that (3.21) becomes

rdre~''' [l + jPWXVrV (/, - 

"T rl' 

4 - i v«X<p»( 0 )(l - - l^Yz{tMh)dtxdh 

+ 2V»X‘p’'(0)(l - 2r^)r^e--^’' f ^ f * I* (/, - fs)"(ri - hYz{lMh)z(h)dtidt^h 

^ -'o •'o 

. yt • yi 

■+- Jv‘x‘p«(0)Hc-«'*[ (ri - 

+ 24v*X»‘p(0)r»^ (3.24) 



The internal integrals in this result, with respect to time, may clearly be determined as soon as the 
change of air-range with time has been prescribed. As these integrals do not involve r, they behave 
thereafter like constants. The integrals with respect to r are all of the form 


where m and n are integers. 
It may be shown that 


/(w,/i) 


',/i) = I ' dr 

Jq 


( 3 . 25 ) 


where $\gamma(x, y)$ is the Incomplete Gamma Function.

In the particular case when the air-range is kept constant, equation (3.24) integrates to the form 


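$$\Phi(T) = C_0(\lambda,Z) + C_1(\lambda,Z)\,\rho''(0)\,T^2 + C_2(\lambda,Z)\,\rho^{iv}(0)\,T^4 + C_3(\lambda,Z)\,\rho''^2(0)\,T^4 + \text{higher terms} \qquad (3.26)$$


where $C_0(\lambda,Z)$, $C_1(\lambda,Z)$ and $C_2(\lambda,Z)$ are expressible in terms of the $I(m,n)$ functions and have been
tabulated.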

Examples of typical gunnery autocorrelation functions are given in Fig. 9, where experimental 





values of $\rho(t)$ are graphed against time: one of the two curves has been obtained from the values of
the elevation component of the aiming error when a certain modern method is in use for controlling
the tracking of the guns in a bomber turret, the error being recorded by a gyro cine-assessor;
the other curve is based on the corresponding data for the traverse component.

The equation (3.26), and similar equations involving variations in air-range, have been used as a
basis for a variety of calculations. As an example, Fig. 10 shows the variation of the lethality,
measured by $-\log \Phi$, with the aiming and gun dispersions, when a one-second burst of fire is
directed from a single cannon against a target of area 5 square feet at a range of 200 yards, the rate
of fire and autocorrelation function being typical for such conditions: the values of $-\log \Phi$ are
plotted against $\sigma_g$, and each curve corresponds to a particular value of $\sigma_a$. The graph indicates
that if the aim-wander parameters remain unchanged, then it may be beneficial, sometimes markedly,
to increase the dispersion of the bullets by increasing the gun dispersion, for by so doing the
undesirable correlation between the bullets is reduced.





Fig. 10. 

Machine used to Compute Autocorrelation Functions 

In the various studies which we have undertaken, it has been necessary to evaluate an enormous 
number of covariances, and, as mentioned earlier, the labour entailed has been tremendously 
reduced by a special relay computer. For the initial development of a machine of this type, 
statisticians are indebted to Shire and Runcorn, who built a small model using punched tapes, 
relays and uniselectors, so that they could obtain autocorrelation functions from data which had 
been grouped to integer values in the range −24 to +24. When the capabilities of this prototype
were appreciated, a larger model was designed and constructed to our requirements by Weir and
Barnes, at the Telecommunications Research Establishment of the Ministry of Aircraft Production,
the range being increased from ± 24 to ± 63, the uniselectors being eliminated so that only
ordinary relays are used in the mechanism, thus achieving a more robust design and greater ease in
maintenance, and various other refinements in detail being introduced. 

The resulting machine is constructed from standard G.P.O. equipment, and comprises six 
main parts — a keyboard perforator, two transmitters, a lamp display, control box and a computer 
rack — all illustrated in Fig. 11. Experimental data, in the form of sequences of numbers, are 
recorded in code as perforations on each of two tapes, which are then read automatically by each





of two transmitters. The machine then carries out the basic process of multiplying corresponding 
pairs of numbers from the two tapes and of summing the products, the final total being shown on 
the lamp display, and remaining there until cancelled by the operator. 

Originally, the numbers were typed in code on the parchment paper tape used in G.P.O. transmitters,
but it was found that a tougher material was required to stand up to constant usage, so
that now all tapes are made of Textuff, a cellulose-impregnated cotton. The typing is done by a
keyboard perforator (see Fig. 12), which, for each operation of a key, can punch up to five holes
across the tape. Each number is always coded by operating two keys: one for the tens of the
number, and one for the sign and units of the number. The first key is selected from the first
row of the perforator, on which the keys are numbered 0, 10, 20 . . . 60; the second key is selected
from the second row if the number is positive, and from the third row if negative, both these
rows being marked 0, 1, 2 . . . 9. Thus +29 is obtained by operating the keys 20 and +9; −15
by keys 10 and −5; +3 by keys 0 (in tens row) and +3. The five holes, which are used in
coding tens, represent the values 2, 4, 8, 16 and 32, in that order across the tape, so that the 20 key
causes the punching of holes 4 and 16. In the case of the units, the first four holes have values 1,
2, 4 and 8, while the last hole gives the sign, being punched if positive, so that +9 is coded by holes
in the first, fourth and fifth positions across the tape. Zero must always be punched positively,
since a negative zero gives no perforation and would be read by the transmitter as the end of the
data on the tape, automatically stopping the computer. Besides the coding holes, the perforator
punches guide holes for the tape-feed mechanism.
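
The two-row code described above can be sketched in a few lines of Python. This is only an illustration of the hole values given in the text; the function names `encode` and `decode`, the use of sets, and the label `"sign"` are assumptions made for the example and are not part of the machine's description.

```python
# Hole values as stated in the text; everything else here is illustrative.
TENS_HOLES = (2, 4, 8, 16, 32)   # five holes of the tens row
UNITS_HOLES = (1, 2, 4, 8)       # first four holes of the units row


def encode(number):
    """Return (tens_row, units_row) as sets of punched hole labels."""
    if not -69 <= number <= 69:
        raise ValueError("keyboard offers tens 0..60 and units 0..9 only")
    tens, units = divmod(abs(number), 10)
    tens *= 10
    tens_row = {h for h in TENS_HOLES if tens & h}
    units_row = {h for h in UNITS_HOLES if units & h}
    if number >= 0:
        units_row.add("sign")    # the sign hole is punched for positive numbers
    return tens_row, units_row


def decode(tens_row, units_row):
    """Recover the signed number from the two rows of holes."""
    magnitude = sum(tens_row) + sum(h for h in units_row if h != "sign")
    return magnitude if "sign" in units_row else -magnitude


print(encode(29))    # tens holes 4 and 16; units holes 1, 8 and sign
print(encode(-15))   # tens holes 2 and 8; units holes 1 and 4, no sign hole
```

Zero is encoded here as a positive number, so its units row carries the sign hole alone, in line with the rule that zero must always be punched positively.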

The coding system described above does not make maximum use of the two rows of five holes
each. Coding in powers of 2, and allowing one hole for the sign, it should be possible to code all
numbers in the range ± 511, but the system adopted permits a very simple conversion from base
10 to base 2, together with a simple keyboard perforator. Even with the present coding, numbers
not exceeding ± 77 could be punched on the tape, but the relays can only accept those between
± 63. Conversion to the scale of two makes the data peculiarly well adapted for relay computation,
for then only two digits, 0 or 1, are possible, and these may be made to correspond to the
only possible states, open or closed, of a relay. Other relay methods of computation, not based
on the scale of two, are generally less simple and less economical.
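
The three limits quoted in this paragraph follow from simple arithmetic, which the short Python sketch below reproduces. The variable names are illustrative, and the reading of ± 63 as the capacity of six binary relays is an inference from the relay lettering A to F described later, not a statement made at this point in the text.

```python
# Ten holes in all: nine carrying binary values and one carrying the sign.
full_binary_range = 2 ** 9 - 1

# Present code: every tens hole plus every units value hole punched.
present_code_range = sum((2, 4, 8, 16, 32)) + sum((1, 2, 4, 8))

# Assumed reading of the relay limit: six binary relays per number.
relay_range = 2 ** 6 - 1

print(full_binary_range, present_code_range, relay_range)
```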

The system of coding is illustrated in the following diagram.

(a) Positive Units and (b) Negative Units (holes across the tape: 1, 2, 4, guide, 8, sign; the sign
hole is punched in (a) and left unpunched in (b))

    ±1 = 1
    ±2 = 2
    ±3 = 1 + 2
    ±4 = 4
    ±5 = 1 + 4
    ±6 = 2 + 4
    ±7 = 1 + 2 + 4
    ±8 = 8
    ±9 = 1 + 8

(c) Tens (holes across the tape: 2, 4, 8, guide, 16, 32)

    10 = 2 + 8
    20 = 4 + 16
    30 = 2 + 4 + 8 + 16
    40 = 8 + 32
    50 = 2 + 16 + 32
    60 = 4 + 8 + 16 + 32

(d) Typical Combinations

    +29 = 20 tens, +9 units
    −15 = 10 tens, −5 units
     +3 =  0 tens, +3 units
    +47 = 40 tens, +7 units
     +0 =  0 tens, +0 units




The perforated tapes are fed through two transmitters, X and N, from right to left, one row
(1/10 in.) at a time, by a small star wheel which engages the guide holes (see Fig. 13). Originally
designed for use on the Multiplex Telegraph System, these transmitters set up, on five wires,
voltages corresponding to the various combinations of holes perforated in the tape, by means of five
selecting rods, mechanically connected to five tongues. The tongues move between two contacts,
so that when the pins are at the limit of their upward thrust, the tongues rest against the “make”
contacts. Thus when a tape is passing through the transmitter, only pins opposite a coding hole
can complete their upward travel, enabling their tongues to make contact, and those pins arrested
in their motion by the tape maintain their tongues in the break position.

The multiplication of each pair of numbers and addition of the product to the total takes about
1 second. If the two coded series of data on the tapes be denoted by $x_1, x_2, \ldots, x_s, \ldots$ and
$y_1, y_2, \ldots, y_s, \ldots$, and if $\sum x_s y_s$ is required, then the tapes should be set in the transmitters so
that when the first transmitter reads $x_s$ from one tape, the second will simultaneously read $y_s$
from the other. When the electrical computer has accepted these two numbers, the tapes automatically
proceed, and while $x_{s+1}$ and $y_{s+1}$ are being read off, $x_s$ and $y_s$ are multiplied together,
and added to the sum of products (which is always zero at the beginning of the operation). This
sequence continues until the end of the tape is reached, or until the operator stops the process. In
particular, when autocorrelation coefficients are being determined, then $y_s = x_{s+N}$, where $N$
may be given any desired value: the x and y tapes are then actually identical, but as they are
passed through the transmitters, one is displaced by $N$ values relative to the other.
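
The basic process, a running sum of products of simultaneously read pairs, with the lagged case obtained by displacing one copy of the tape, can be sketched as follows. The function names and the sample series are purely illustrative assumptions; the sketch ignores timing and the ± 63 input limit.

```python
def sum_of_products(x_tape, y_tape):
    """Multiply corresponding pairs and accumulate, stopping at the shorter tape."""
    total = 0                      # the sum of products starts at zero
    for x, y in zip(x_tape, y_tape):
        total += x * y
    return total


def lagged_sum(series, lag):
    """Sum of x_s * x_(s+lag): the same series on both tapes, one displaced by lag."""
    return sum_of_products(series, series[lag:])


series = [1, -2, 3, -4, 5]         # illustrative data only
print(lagged_sum(series, 0))       # 55
print(lagged_sum(series, 1))       # -40
print(lagged_sum(series, 2))       # 26
```

Dividing such lagged sums by the appropriate number of terms, after allowing for the mean, gives the covariances from which the autocorrelation function is built up.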

Fig. 14 shows the lamp display, a series of powers of 2, from which the required product is
obtained by totalling the illuminated values. In the same photograph appears the control-box,
used in all normal operations. The “Start” key, on it, closes both the transmitter and multiplier
circuits, and when returned to normal resets the whole equipment, cancelling the total and all
storage except that on the display relays. The “Stop” key enables the operator to stop the transmitters
while still retaining the total and storage; this is used to examine the total at an intermediate
point on the tape, if required. On this installation the “Cancel Total” key has been blocked up,
since it served no purpose that could not be performed equally well by the other keys. Its function
is to enable the final product to exceed ± 524,287, by operating the “Stop” key before it is reached,
displaying the total, recording it, and then cancelling everything with the “Cancel Total” key, a step
obviously just as easily done by restoring the “Start” key to its original position. The amount
so extracted is added to the final answer. The “Display” key in the “Cancel” position clears the
display relays, and cuts off the lamps; in the “Change” position operates the display relays to
correspond with the latest total, and in the upright position retains the reading of the last key shift.
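
As a rough illustration of how a total is read from the lamps, and of the stop, record and reset procedure for totals beyond ± 524,287, consider the sketch below. It assumes a display of 19 binary lamps, inferred only from the fact that 524,287 = 2^19 − 1; the names `LIMIT`, `lamps_lit` and `accumulate` are hypothetical.

```python
LIMIT = 2 ** 19 - 1   # 524,287, the largest total quoted in the text


def lamps_lit(total):
    """Powers of 2 that would be illuminated for the magnitude of a total."""
    magnitude = abs(total)
    return [2 ** k for k in range(19) if magnitude & (2 ** k)]


def accumulate(products):
    """Sum products, extracting and resetting the running total before it overflows."""
    extracted = 0     # amounts recorded by the operator and added to the final answer
    running = 0       # total held in the machine
    for p in products:
        if abs(running + p) > LIMIT:
            extracted += running   # operator stops, records the total and resets
            running = 0
        running += p
    return extracted + running


print(lamps_lit(517))              # [1, 4, 512]
print(accumulate([3969] * 200))    # 793800, beyond the display limit
```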

Of the four lamps shown on the control box, “Excess Input” and “Excess Storage” need no
explanation. “No Storage” is a signal that the multiplying relays have exhausted all the tens
and units stored in the addition relays, lighting intermittently in normal sequence, and continuously
when the end of the tape is reached, or if the tape sticks owing to worn feed holes. The “Test”
lamp operates when either the “Test” key, “Change-over” keys or “Display” key in the Test
Panel are being used, and warns the operator that the machine is not available for normal use. A
counter on the control-box records the number of products which have been evaluated and added
together.

The Test Panel is illustrated in Fig. 15, and its various keys ought never to be used unless the
“Test” key is in operation. On it, two “Change-over” keys, C.O., will cut out the data from
the transmitters X and N, substituting data from the “Numerical Input” keys, designated “X
input” and “N input.”

It is fairly easy to determine whether a fault lies in the transmitter or computer racks by replacing
the data from the transmitters by that from the “Numerical Input” keys. In addition to the use
of the Test Panel as a fault-finding device, the “Input” keys perform a useful function in finding
the sum of a series of data. By using the +1 key on the X-input and inserting the tape containing
the series in the N transmitter, each integer will be added to the next, and the total shown on the
display board.

The “Test” key also brings into operation two keys starting the transmitting and multiplying
circuits independently, a “Display” key corresponding to that on the operator’s box, a “Reset”
key, and an “Impulse Test” key. The “Impulse Test” key allows manual control of the relays
by the rotary switch (top right of photograph) which gives two impulse cycles per revolution.

Fig. 16 shows the computer rack with relays uncovered. The letters on these relays give some
indication of their function: thus in AUN, the first of the relays, UN denotes that units are stored
here from the N transmitter or input, while the first letter gives the numerical value, in this case A
corresponding to $2^0 = 1$. Similarly B corresponds to $2^1$, C to $2^2$, D to $2^3$, and so on. In the same way,
in BTN, TN denotes that tens are stored here from the N transmitter or input, and the numerical
value is again indicated by the first letter B, corresponding to $2^1$. When NTF operates, the contents
of the two rows of relays are added together, and the sum appears in the SN group (ASN, BSN . . .
FSN). The sign of the data is taken by relay SGN. A similar set of relays carries the data from
the X-transmitter. For example, the number +37 will be fed into the N relays as 30 stored in the
tens relays ETN, DTN, CTN and BTN, and +7 stored in the units relays, CUN, BUN, AUN and
SGN. When NFT operates, the two will add as follows:

    f e d c b a
      1 1 1 1 0    Tens
          1 1 1    Units
    1 0 0 1 0 1    Total

in accordance with the addition laws in radix 2

    0 + 0 = 0
    0 + 1 = 1
    1 + 1 = 10
    1 + 1 + 1 = 11

so that +37 is stored in the “S” relays, FSN, CSN, ASN and SGN.

This demonstrates very well the simplicity offered by this system of converting numbers from 
base 10 to base 2. A sacrifice has, of course, been entailed in that maximum use is not made of the 
ten coding holes, which could have provided a range of ± 511, as mentioned earlier. 
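
The addition of the tens and units rows can be mimicked by a bit-by-bit addition with carry, using only the radix 2 laws quoted above. The sketch below is an illustration under assumptions: the helper names `to_bits` and `add_rows` are invented for the example, and the list index 0 is taken to stand for relay A.

```python
def to_bits(value, width=6):
    """Binary digits of value, least significant first (index 0 = relay A)."""
    return [(value >> k) & 1 for k in range(width)]


def add_rows(tens_bits, units_bits):
    """Add two rows of binary digits position by position, carrying as needed."""
    total, carry = [], 0
    for t, u in zip(tens_bits, units_bits):
        s = t + u + carry          # 0, 1, 2 (= binary 10) or 3 (= binary 11)
        total.append(s % 2)
        carry = s // 2
    return total, carry


tens = to_bits(30)                 # relays B, C, D, E closed
units = to_bits(7)                 # relays A, B, C closed
total, carry = add_rows(tens, units)
closed = [name for name, bit in zip("ABCDEF", total) if bit]
print(closed)                      # ['A', 'C', 'F'], i.e. 1 + 4 + 32 = 37
```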

The numbers so introduced into the SX and SN relays are now to be multiplied together. The
sign of their product is first determined by relay SGS, which is open if SGN and SGX are both
open or both closed, and closed if these two are in different states. Then, if ASX be closed, APP,
the first of the “Partial Product Selection” relays (APP, BPP . . . FPP), is also automatically
operated, and the latter then closes those of AB, BB . . . LB, which correspond to relays closed in
the SN group. Thus if ASN, BSN and DSN are closed, the AB, BB and DB will close.

Up to this stage the previous total of products has been retained in the “A” relays (AA, BA, . . .
TA), and the sum or difference (according as SGS is open or closed) of the data in the “A” and
“B” groups is now built up in the “C” set (AC, BC, . . . TC): this new total is then transferred
back to the “A” group, the “B” relays are cleared, APP and ASX are released, and the next SX
relay closes its corresponding PP relay. If this be BPP, then a similar sequence is carried out,
except that this selection relay does not cause the closure of those “B” relays in exact correspondence
with the SN set, but those “B” relays which are one power higher in the binary scale. In
other words, data in the SN group, when passed by BPP into the “B” relays, are staggered one
place. Similarly, each of the other PP relays, closed in turn by the SX group, selects an appropriate
amount of stagger, and the product of the number in the SN and SX sets is built up and added to
the “A” relays, according to the following scheme:
    k j h g f e d c b a

            1 0 1 1 1 1    47 in SN group.
                1 0 1 1    11 in SX group.

            1 0 1 1 1 1    no stagger, first partial product selected by APP.
          1 0 1 1 1 1      one place stagger, second partial product selected by BPP.
            0 0 0 0 0 0    CPP not operated.
      1 0 1 1 1 1          three place stagger, third partial product selected by DPP.

    1 0 0 0 0 0 0 1 0 1    Total contribution to sum of products, equal to 517.
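
The staggered partial-product scheme is what would now be called shift-and-add multiplication. A minimal Python sketch follows; the function name `relay_multiply` and the returned list of partial products are assumptions made for illustration, and the sketch models only the arithmetic, not the sequencing of the relays.

```python
def relay_multiply(x, n):
    """Multiply by adding copies of |n| staggered according to the binary digits of |x|."""
    negative = (x < 0) != (n < 0)          # SGS is closed when the two signs differ
    sx, sn = abs(x), abs(n)
    partials = []
    for stagger in range(6):               # selection relays APP ... FPP
        if (sx >> stagger) & 1:            # this SX relay is closed
            partials.append(sn << stagger) # SN data staggered by that many places
    magnitude = sum(partials)
    return (-magnitude if negative else magnitude), partials


product, partials = relay_multiply(11, 47)
print(partials)    # [47, 94, 376]
print(product)     # 517
```

The contribution is then added to or subtracted from the running total according to the sign, matching the sum-or-difference step governed by SGS.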


Except at those intermediate stages in the multiplication which involve the adding of a partial
product to the previous total, the data contained in the “A” and “C” relay sets are equal to one
another and to the most recent sum of products. By operation of the “Change Display” key
this latest total can be transferred from the “C” set to the “D” relays, and on the display board
those lamps light which correspond to the digits whose “D” relays are operated.

During constant usage, the machine has shown itself to be extremely reliable. Although slower
than an electronic machine, it has the advantage of being much less expensive and much more 
robustly constructed. 






The evaluation of correlation coefficients is not the only computation which may be readily
performed by this relay computer: it will also give an accurate estimate of the value of

$$\int f(x)\,g(x)\,dx$$

if numbers be perforated on the two tapes which are proportional respectively to the values of the
functions $f(x)$ and $g(x)$ at close and regular intervals of their argument. The determination of the
spectrum of a random variable as the Fourier cosine transform of its correlogram is therefore
particularly easy: the correlogram is represented on one tape, while a series of tapes is maintained
permanently available which represent cosine functions of various frequencies; thanks to the
periodicity of the cosine functions, these latter tapes may be closed endless bands. To test the
efficiency of the machine in this application, the function

$$e^{-\lambda t}\cos \mu t; \quad \lambda = 0.00876; \quad \mu = 0.022848$$

was represented by a tape of 70 consecutive values, and the cosine transform of the function then
estimated. The values so obtained are shown in the following table, together with values of the
theoretical result

$$S(\omega) = \frac{2}{\pi}\int_0^\infty e^{-\lambda t}\cos \mu t\,\cos \omega t\,dt = \frac{2}{\pi}\,\frac{\lambda(\lambda^2 + \mu^2 + \omega^2)}{(\lambda^2 + \mu^2 - \omega^2)^2 + 4\lambda^2\omega^2}$$


    ω                   0.039270  0.034148  0.025335  0.023800  0.023100  0.022440  0.020668
    S(ω) estimated         8.841    14.205    34.912    37.128    37.495    37.544    35.871
    S(ω) theoretical       8.758    14.479    34.788    37.150    37.581    37.569    35.634

    ω                   0.017453  0.015708  0.013090  0.006545  0.003272  0.000000
    S(ω) estimated        28.372    23.838    18.179    11.369     9.866     9.337
    S(ω) theoretical      27.985    23.616    18.254    11.105     9.736     9.314
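
The theoretical row of the table can be checked from the closed form, and the machine's sum-of-products estimate imitated by a numerical sum. In the sketch below the closed form is the standard cosine transform of a damped cosine; the step `dt` and the number of terms `n` in `s_estimated` are assumptions chosen for a fine, long grid, and do not represent the spacing of the 70 values on the original tape, which is not stated.

```python
import math

LAM, MU = 0.00876, 0.022848    # the two constants quoted with the test function


def s_theoretical(omega):
    """Closed form of (2/pi) * integral of exp(-LAM t) cos(MU t) cos(omega t) dt."""
    a = LAM ** 2 + MU ** 2
    num = LAM * (a + omega ** 2)
    den = (a - omega ** 2) ** 2 + 4 * LAM ** 2 * omega ** 2
    return 2.0 / math.pi * num / den


def s_estimated(omega, dt=1.0, n=4000):
    """Trapezoidal sum of products of the correlogram and a cosine (assumed grid)."""
    total = 0.0
    for k in range(n + 1):
        t = k * dt
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-LAM * t) * math.cos(MU * t) * math.cos(omega * t)
    return 2.0 / math.pi * total * dt


for omega in (0.039270, 0.022440, 0.0):
    print(omega, round(s_theoretical(omega), 3), round(s_estimated(omega), 3))
```

A coarser or shorter grid, such as a tape of only 70 values, would be expected to give somewhat larger discrepancies, of the kind visible between the estimated and theoretical rows above.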



References 

(1) A. C. Aitken: “Some Applications of Generating Functions to Normal Frequency,” Quart. J. of
Maths., 2 (1931), pp. 130-135.

(2) S. Chandrasekhar: “Stochastic Problems in Physics and Astronomy,” Rev. Modern Physics, 15
(Jan. 1943), pp. 1-89.

(3) S. S. Wilks: Mathematical Statistics, Princeton University Press (1943).

(4) J. Wishart and M. S. Bartlett: “The Generalized Product Moment Distribution in a Normal System,”
Proc. Camb. Philosophical Soc., 29 (1933), pp. 260-270.


Discussion on the Papers 

Mr. M. G. Kendall : I move a vote of thanks to the authors of the three papers with great 
pleasure. One of the primary objects of re-instituting this Section was to provide an oppor- 
tunity for workers in various fields to compare notes, and the papers we have had to-night form 
an ideal illustration of the value of such an opportunity. I should like to congratulate the four 
contributors and the Section on these papers. I should also like to add a word of congratulation 
to myself, because I claim the distinction of having worked out more of these serial correlations
by hand than any other living worker, and I am glad to hear that that period of hard labour
is now at an end. I hope that the machines about which we have heard may be made generally
available for private workers. It is beyond the scope of any individual or the ordinary 
institution to acquire them. 

On the machines themselves I have only one question to ask, and that relates to the optical 
device described by Mr. Foster. It is necessary in some classes of work to work out serial correlations
with some accuracy to an order as high, perhaps, as the 50th, for a series which may
run into a thousand terms ; and I am not sure how far such series can be photographically repro- 
duced accurately enough to enable calculations of such an order to be made. In other words, 
what is the instrumental error as compared with the sampling error? 

I have one comment to make on Mr. Foster’s paper. I was gratified to find that he had confirmed,
quite independently, the conclusions to which I came in dealing with economic series.
He says that his conclusions are not quite so drastic as mine, but I do not think there is much
between us. I found in an economic series that periodogram analysis showed a large number of 
periods for consideration, and concluded that the analysis is extremely misleading. Mr. Foster 




would agree that if one tries to identify a separate period with each of the trial periods thrown 
up, they are not identifiable; and says that if there are a number of periods of that kind, it is
indicative of some one disturbed period in the series. I do not differ from that, except that I
would like the point settled by some experiments as follows: if someone would put through a
periodogram analysis a few random series of moderate length, say 200 or 300 terms, and show what 
sort of periodogram was obtained from that result, it would be extremely interesting. I should 
not be surprised if there were quite a number of periods thrown up as significant in the ordinary 
meaning of the term. If not, then I think Mr. Foster is quite right, and that the existence of a 
number of these periods probably does indicate one single disturbed oscillation of the series. 

On the paper of Dr. Cunningham and Mr. Hynd I have one comment. I wonder if their
technique is applicable to naval gunnery. When a ship is rolling there is some sort of approximate 
harmonic motion imposed on the range-finding apparatus, and it would be interesting to see 
whether there is any harmonic difference shown up in the correlogram, or whether there is the 
same kind of correlogram for guns which are more firmly fixed. 

I have left Dr. Bartlett to the last because I have more to say with regard to his paper. First 
of all, there are two comments on the question of standard errors. Firstly, the standard errors 
which he gives depend not on the observed correlations, but on the parent correlations. If the 
observed correlations are 100 times too large, as they may very well be, then the standard error 
given by the formula may be quite wide of the mark. It may be two or three times the true value. 
Dr. Bartlett has foreseen that difficulty, and suggests using an autoregressive scheme and fitting 
the first two serials which are more likely to be near the mark. Secondly, in most work that I 
have undertaken on correlogram analysis one is interested, not in the significance of any particular 
coefficient, but in the character of the series as a whole. In the correlograms in the paper the 
question is not whether the actual values are too large or too small, but whether there is any 
significance in the undulations as a whole. The indications are, as far as one can judge from 
experiment, that when definite sinusoidal movements appear in the correlogram they arise from 
some property of the series, and not from the nature of the analytical processes ; but one would 
like some test on the point. 

I should like to have said a good deal about the continuous random process, because I think
there is some danger here of the mathematician running away with us to a slight extent. I have
never been able to imagine any sort of continuous randomness; it seems to me that it is essentially
an idea of discontinuity. It may be that the discontinuity in the sort of series we observe is in
the first or second order differentials, that it is the impulses and accelerations which change
discontinuously, not the actual values of the series themselves. But I think that in economic series
one must provide, even if the series itself is continuous in the sense that it proceeds through time,
for random disturbances imposed on the series from outside. Yule has said, as I have, that the
type of series we employ is not by any means the last word on the subject. It is admittedly only 
an approximation, and one must at some stage try to find a scheme which does provide for this 
more continuous process. To some extent I would have expected the generalization to proceed 
not so much from a consideration of the continuous random process as from that of shocks of 
finite extent which do not occur at regular intervals of time, but at irregular intervals, the random- 
ness lying in the time intervals between the successive disturbances. 

Perhaps I can add one remark on that. The scheme introduced by Yule has the advantage 
that it permits of the construction of experimental series. I cannot think of any method of con- 
structing a continuous random series, but one of the difficulties is to make sure, when analysing 
a series, that one knows its properties beforehand. In the Yule scheme one knows what one 
should find, and if it is not there, one can conclude that the methods of analysis are wrong.

Personally, I feel that this work which we have had put before us to-night has taken the subject 
a great deal further. We obviously still have a long way to go, but the signs are hopeful ; and I 
move the vote of thanks with great pleasure. 

Dr. H. E. Daniels : I am particularly pleased to have been asked to second the vote of thanks
this evening. It was my privilege to work with Dr. Cunningham and Mr. Hynd for over three 
years and watch the development not only of the fruitful methods described in their paper to- 
night, but of other fundamental research initiated by Dr. Cunningham at the Air Ministry which 
I hope we shall hear about in the future. Mr. Foster’s paper is also of special interest to me, 
as his pioneering work on cotton-spinning has been known to us at the Wool Industries Research 
Association for many years, and it is good to know that his results and the ingenious machines 
he has devised for periodic analysis are at last being made available to workers in other fields. 

The problem of the target survival chance discussed in the latter part of Dr. Cunningham 
and Mr. Hynd’s paper is an interesting and fundamental one. It was outlined in a simpler form 
(without gun dispersion) by Prof. Pearson in the discussion on Mr. Kendall’s paper to the Society
last year, and Mr. Foster hints at it in a very different context in his reference to the distribution 
of yarn strength, which is determined by the behaviour of the weakest places in given lengths 
of yarn along which the strength is autocorrelated. 

The approach by Cunningham and Hynd is very general, granted the assumption of normal
residuals, as it places no restriction on the behaviour of the autocorrelations except that they
should not be too far from unity, a condition which is usually satisfied in practice in their problem
of short bursts. But in the case of yarn strength, and possibly in certain other gunnery problems
too, the intervals may be long enough for the condition of high correlation not to be met, and
the Cunningham-Hynd series expansion is then not very tractable.

In such cases an alternative approach might be tried along the lines of recent work on the
theory of Brownian motion, if one is prepared to assume that the time series is generated by a
simple autoregressive process. When there is no superposed error one has to solve the appropriate
Fokker-Planck equation with a reflecting boundary, a familiar though not yet completely solved
problem in that subject. The theory can also be extended to include superposed random error;
for example, in the simple case of a one-dimensional Markoff process,

X f -= A 

if the survival chance in an interval $dt$ is $1 - \phi(x, t)\,dt$, the Fokker-Planck equation extends to


where $\sigma^2\,dt$ is the variance of the increment of $x$ in $dt$, and the required total survival chance after
time $t$ is $\int_{-\infty}^{\infty} P\,dx$. A similar formula in radial co-ordinates would apply to the gunnery problem.

I have not been able to solve explicitly even this simple form of the equation, let alone the more 
complicated equations for higher-order processes, but numerical solution by relaxation methods 
might provide useful information in the region of moderate correlations not adequately dealt 
with by the Cunningham-Hynd series expansion. 

We are fortunate in having three new machines described to us for the calculation of serial 
correlations. For data provided in numerical form, the relay computer is in my opinion un- 
doubtedly the best of the three, but when the data are presented in the form of continuous records 
on a chart, the machines described by Mr. Foster seem more convenient, especially for textile 
work where high accuracy is not demanded. The Martindale optical instrument is the most 
rapid once the transparencies are prepared, but until a machine is available to perform automati- 
cally this at present rather messy operation, I prefer the integrating wheel. As it is relatively 
inexpensive to construct, would it not be a worth while improvement to arrange two integrating 
wheels to calculate Si -i* and Sa - SUi — simultaneously? The intra-class 

n c 

correlation r could then be conveniently obtained from each “ run through.” If the 

Si -i" S2 

practice is adopted of calculating correlations by dividing each covariance by a single variance 
computed from the whole of the original data, one is occasionally embarrassed by correlations 
which apparently exceed unity. 

Mr. Foster’s researches on irregularity in cotton-spinning inspired us at the Wool Industries 
Research Association to try out similar investigations on wool. There had been an impression 
that wool was “better behaved” in spinning than cotton, on account of its longer fibres, but our
correlograms turned out to be depressingly similar to Mr. Foster’s, with periods even more in 
evidence, and Spencer-Smith’s parallel work on linen tells much the same story. This is not 
really surprising, since, as Mr. Foster will no doubt agree, the important factor affecting the 
amplitude of the drafting wave is not so much the average length of the fibres as their relative 
variability. 

I should like to ask Mr. Foster whether he has found any evidence in cotton series of what 
might be called a directional effect. From the theory of drafting outlined in his paper, one might 
expect the thickness to increase relatively slowly up to a point, and then to diminish rapidly as 
the tuft is pulled through the front rollers, producing a kind of saw-toothed appearance in the 
thickness curve. That such an effect may exist in wool is suggested by a tendency to skewness 
in the form of the frequency distribution of first differences, though the evidence is admittedly 
inconclusive in the absence of a suitable test of significance. It is perhaps worth observing that 
a directional effect of this nature is produced in an autoregressive series when the distribution of
the residuals is skew. 

I have left little time to comment on Dr. Bartlett’s valuable paper, which has clarified for me
much of the hitherto puzzling behaviour of autocorrelations in short series. In particular, his 
standard error and correlation formulae for sample autocorrelations, crude as he admits them 
to be, are of considerable practical assistance in the interpretation of sample correlograms. There 
remains, of course, the question as to how far one is entitled to test individual autocorrelations 
for significance without making due allowance for selection, and perhaps Mr. Kendall had qualms 
about it when he suggested testing the correlogram as a whole for evidence of oscillatory move- 
ments. The fact that, as Dr. Bartlett points out, autocorrelations in short series are themselves 
highly correlated may minimize the importance of this, since there are effectively fewer independent
correlations to select from. To do the job completely would presumably involve the
solution of a “survival chance” problem even more formidable than the Cunningham-Hynd 
one ! 

I have great pleasure in seconding the vote of thanks to the authors of these three stimulating 
papers. 

The Chairman then read a message from Mr. G. Udny Yule in which he expressed his regret 
at not being able to attend the meeting. He considered that the Research Section was maintain- 
ing a high standard with these three valuable papers, and wished to send his good wishes to the 
Section for a flourishing future. 

Dr. M. S. Bartlett then read a summary of the following written contribution from Professor 
P. J. Daniell, who expressed his regret at being prevented by illness from attending the meeting. 

My absence from this symposium is a grief to me. The subject is very important and inter- 
esting, and I send the following notes. 

The work done in America has been based on a fundamental study by N. Wiener of integrals 
in an infinite number of dimensions corresponding to the values of the fluctuating quantity at 
various instants. This work is not behind that of the Russian school in time or importance. 

In erratic fluctuations there will be no sharply marked frequencies, and formula (17) of Bartlett’s
paper can be expressed in the form

$$\rho(s) = \frac{1}{\pi}\int_0^\infty \cos \omega s \, \phi(\omega)\, d\omega \qquad (1)$$

where $\phi(\omega)$ is the spectral intensity.

If y is the resultant of a stationary time-series of chaotic instantaneous impulse functions
(Dirac’s δ function), the spectral intensity is constant. If x is the response to these impulses by
means of a “linear mechanism,” then in the usual operational notation

$$f(p)\,x = y, \qquad i = \sqrt{-1} \qquad (2)$$

In this case $\phi(\omega)$ for x will be proportional to

$$|f(i\omega)|^{-2} = [f(p)f(-p)]^{-1}, \qquad p = i\omega.$$

The mechanism is presumed to be stable with decay and then the roots, a, of $f(p) = 0$ have only
negative real parts.

By contour integration from (1) it appears that the autocorrelation ρ is a time-function which
for positive s is such that its Laplace transform is proportional to

$$\sum_a \frac{1}{f'(a)\, f(-a)\, (p - a)} \qquad (3)$$

summed for all roots a of $f(p) = 0$.

This agrees with Bartlett’s formula (31), but it shows not only the differential equation satisfied
by $\rho(s)$, it also gives the particular solution needed. Moreover, in many cases, as with pure
time-lag problems, $f(p)$ corresponds to no finite differential equation.

For convenience we standardize by multiplying by such a constant that $\rho(0) = 1$.

The Laplace transform of the impulse function is 1. If $1/f(p)$ is the transform of the displacement,
x, caused in the mechanism by the impulse function at time 0, then the spectral intensity
is inversely proportional to $|f(i\omega)|^2$ and $\rho(s)$ can be found from (3) or (1).

The Laplace transform point of view has the advantage of combining mechanisms in cascade 
by ordinary multiplication.

There are also extensive examples from mechanical and electrical problems. 

Example 1. Simple damping.

$$f(p) = p + k$$

$$\phi(\omega) = \text{const.} \times (\omega^2 + k^2)^{-1}.$$

In equation (3) there is only one root, $-k$, and $\rho(s)$ is therefore proportional to the function
whose transform is $1/(p + k)$.

Hence

$$\rho(s) = \exp(-ks).$$

Example 2. Electrical tuned circuit.

Put $2k = R/L$, $\omega_1^2 = 1/(LC)$.

Then

$$f(p) = \text{const.} \times (p + 2k + \omega_1^2/p) \qquad (4)$$

If $k = \omega_1 \cos\lambda$, $\omega_0 = \omega_1 \sin\lambda$,

the roots of $f(p) = 0$ are at

$$a = -\omega_1 \exp(\pm i\lambda),$$

and for each of these $f(-a) = 4k$. Hence the transform of $\rho(s)$ is inversely proportional to $f(p)$
and

$$\rho(s) = \exp(-ks)\,\sin(\lambda - \omega_0 s)/\sin\lambda \qquad (5)$$
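As a numerical illustration of Example 2, the sketch below compares formula (5) with a direct cosine transform of the spectral intensity $1/|f(i\omega)|^2$ for $f(p) = p + 2k + \omega_1^2/p$, standardized so that $\rho(0) = 1$. The values $k = 0.3$, $\omega_1 = 1$, the truncation point `w_max` and the midpoint rule are assumptions chosen for illustration only; the function names are hypothetical.

```python
import math


def spectral_intensity(w, k, w1):
    """1/|f(iw)|^2 for f(p) = p + 2k + w1^2/p (constant factor omitted)."""
    return 1.0 / (4.0 * k * k + (w - w1 * w1 / w) ** 2)


def rho_numeric(s, k, w1, w_max=2000.0, steps=200_000):
    """Midpoint-rule cosine transform of the spectral intensity, scaled so rho(0) = 1."""
    dw = w_max / steps
    num = 0.0
    den = 0.0
    for j in range(steps):
        w = (j + 0.5) * dw          # midpoints avoid the point w = 0
        phi = spectral_intensity(w, k, w1)
        num += math.cos(w * s) * phi
        den += phi
    return num / den


def rho_formula(s, k, w1):
    """Formula (5) for the under-damped circuit (requires k < w1)."""
    lam = math.acos(k / w1)         # k = w1 cos(lambda)
    w0 = w1 * math.sin(lam)         # w0 = w1 sin(lambda)
    return math.exp(-k * s) * math.sin(lam - w0 * s) / math.sin(lam)


if __name__ == "__main__":
    for s in (0.0, 1.0, 3.0):
        print(s, rho_numeric(s, 0.3, 1.0), rho_formula(s, 0.3, 1.0))
```

The two columns should agree to within the error of truncating the integral at `w_max`, which is small because the spectral intensity falls off as $1/\omega^2$.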

Formula (30) of Bartlett’s paper arises similarly from taking

$$f(p) = p^2 + \alpha p + \beta.$$

In this case $f(-p)$ does not have the same value at the two zeroes of $f(p)$, so that $\rho(s)$ is not
quite so simple.

Conversely, given $\rho(s)$, and therefore $\phi(\omega)$, it is possible to invent a mechanism, that is to
say, an $f(p)$, which would yield $\rho(s)$ under certain mathematical conditions. One can refer to
Titchmarsh * using

$$-\log \phi(\omega) = \log f(i\omega) + \log f(-i\omega).$$

[Note. In example 2, if the circuit is over-damped we put $k = \omega_1 \cosh\lambda$, $\omega_0 = \omega_1 \sinh\lambda$ and
replace sin by sinh in equation (5).]

On the question of errors of estimates my contribution is slight in comparison with the work 
of Bartlett and Cunningham. My interest was aroused by a discussion as to whether in an 
empirical attack the autocorrelation or the frequency spectrum would be more subject to error. 
By elementary algebra it can be shown that the methods are equivalent both in results and in
accuracy if the proper formula of translation is used. 

Let x(t) be known at N instants of time separated by intervals h so that

total time $T = Nh$.

Let

$\omega = 2\pi n/T$, n integer from $-\tfrac{1}{2}(N - 1)$ to $\tfrac{1}{2}(N - 1)$,
assuming N to be odd. Then

$$\phi(\omega) = h\Big[v(0) + 2\sum_{r=1}^{N-1}\Big(1 - \frac{r}{N}\Big)\, v(rh)\cos \omega rh\Big] \qquad (6)$$

where $\phi(\omega)$ is the proper estimate of spectral intensity and $v(\tau)$ is the estimated covariance of
$x(t)$, $x(t + \tau)$. This formula is an algebraic identity for any empirical set. Taking an infinite
population of such sets

$$\bar\phi(\omega) = h\Big[1 + 2\sum_{r=1}^{N-1}\Big(1 - \frac{r}{N}\Big)\, \rho(rh)\cos \omega rh\Big].$$

Errors in estimates of v and of φ will exactly correspond.
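The algebraic identity can be checked on any small data set. The sketch below assumes, as the text does not say so explicitly, that the "proper estimate" of spectral intensity is the periodogram $(h/N)\,|\sum_t x_t e^{-i\omega t h}|^2$ and that $v(rh)$ is the mean lagged product over the $N - r$ available pairs, with the data measured about a zero mean. The data values and function names are hypothetical.

```python
import cmath
import math


def covariance_estimates(x):
    """v(rh): mean of x[t] * x[t + r] over the N - r available pairs."""
    n = len(x)
    return [sum(x[t] * x[t + r] for t in range(n - r)) / (n - r) for r in range(n)]


def spectrum_from_covariances(x, omega, h):
    """Right-hand side of (6): h * [v(0) + 2 * sum (1 - r/N) v(rh) cos(omega r h)]."""
    n = len(x)
    v = covariance_estimates(x)
    total = v[0]
    for r in range(1, n):
        total += 2.0 * (1.0 - r / n) * v[r] * math.cos(omega * r * h)
    return h * total


def spectrum_direct(x, omega, h):
    """Periodogram computed directly from the data: (h/N) |sum x_t exp(-i omega t h)|^2."""
    n = len(x)
    amp = sum(x[t] * cmath.exp(-1j * omega * t * h) for t in range(n))
    return h * abs(amp) ** 2 / n


if __name__ == "__main__":
    data = [0.3, -1.2, 0.8, 1.5, -0.4, -0.9, 0.1]   # N = 7, odd
    step = 0.5
    total_time = len(data) * step
    for j in range(-3, 4):
        w = 2.0 * math.pi * j / total_time
        print(j, spectrum_from_covariances(data, w, step), spectrum_direct(data, w, step))
```

The two columns agree to rounding error at every frequency, so an error in one estimate translates exactly into an error in the other.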

It happens, however, that the covariance of $v(r_1 h)$, $v(r_2 h)$ is of the same order as the variance
of either, so that the variance of $\phi(\omega)$ is considerable, and does not decrease appreciably as N
(or rather T) increases. Thus $\phi(\omega)$ appears to be more subject to error than $v(\tau)$. However, the
covariance of $\phi(\omega_1)$, $\phi(\omega_2)$ is small, so that the variance of a mean $\phi(\omega)$ taken over a broad band
of frequencies can be made as small as that of $v(\tau)$. This suggests that in some cases an alternative
method of estimation would be to run the data in suitable form through a mechanism, including
perhaps 10 or 20 vibrators damped to produce broad tuning and to determine the mean energy
in each. If we knew the operator functions for these vibrating elements we could deduce the
original spectral intensity and obtain $\rho(s)$ from the Fourier transform.
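A small simulation can illustrate the point. The sketch below is an illustration under assumptions not made in the text: the series is uncorrelated normal noise (so the true spectral intensity is flat), the band consists of 16 adjacent Fourier frequencies, and the seed and series lengths are arbitrary. It reports the variance divided by the squared mean, first for the raw spectral ordinates and then for the band means.

```python
import cmath
import math
import random


def periodogram(x):
    """Spectral ordinates |sum x_t exp(-i w_j t)|^2 / N at w_j = 2 pi j / N, j = 1 .. (N-1)//2."""
    n = len(x)
    ordinates = []
    for j in range(1, (n - 1) // 2 + 1):
        amp = sum(x[t] * cmath.exp(-2j * math.pi * j * t / n) for t in range(n))
        ordinates.append(abs(amp) ** 2 / n)
    return ordinates


def relative_variance(values):
    """Sample variance divided by the square of the sample mean."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return var / (m * m)


def band_means(values, width):
    """Means over non-overlapping bands of `width` adjacent ordinates."""
    return [sum(values[i:i + width]) / width
            for i in range(0, len(values) - width + 1, width)]


rng = random.Random(1946)            # arbitrary fixed seed, for repeatability only
results = {}
for n in (257, 1025):
    series = [rng.gauss(0.0, 1.0) for _ in range(n)]
    raw = periodogram(series)
    results[n] = (relative_variance(raw), relative_variance(band_means(raw, 16)))
    print(n, results[n])
```

For both lengths the raw ordinates should show a relative variance near 1, while the band means should show a much smaller one, roughly in proportion to the number of ordinates averaged.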

Since the problem concerns correlation between many (in fact an infinite number of) variables,
it is best to follow the usual custom in theoretical studies of correlation which is to assume normal
distributions of variables. The fundamental lemma is then that for variables $y_1, y_2, y_3, y_4$, the
same or different,

$$\text{cov.}(y_1 y_2,\, y_3 y_4) = E(y_1 y_3)E(y_2 y_4) + E(y_1 y_4)E(y_2 y_3) \qquad (7)$$

It follows that

$$\text{cov.}\{v(s), v(s + t)\} = \frac{h}{T - s}\sum_u \lambda(u)\,[\rho(u)\rho(u + t) + \rho(u - s)\rho(u + s + t)] \qquad (8)$$

in which u is a multiple of h and $\lambda(u)$ equals 1 when $-t \le u \le 0$, $\lambda(u)$ equals 0 when $u \ge T - s - t$
and when $u \le -(T - s)$, and in between $\lambda(u)$ varies linearly.

If T is sufficiently great so that $\rho(s)$ is small for values of s of the same order as T, then as an
approximation we have Bartlett’s formula (6) in the form

$$\text{cov.}\{v(s), v(s + t)\} = \frac{h}{T - s}\sum_{u=-\infty}^{\infty} [\rho(u)\rho(u + t) + \rho(u - s)\rho(u + s + t)].$$
This he kindly attributes to me, but I imagine it is well known. 

* E. C. Titchmarsh, Fourier Integrals, (1937), A131. 





If h is small the summation can be replaced by integration, but in no case should the factor
$T - s$ be replaced by T. This factor corresponds to the a priori deduction that relative errors of
estimates must increase as s approaches T.

The expectation of $v(0)$ is 1, but it has as much variance as any $v(\tau)$.

On the subject of the variance of the estimated correlation, as distinct from covariance, I
have nothing to add to the more profound work of Bartlett and Cunningham.

If we use the lemma (7) and assume h small enough to allow us to replace summations by 
integrations, 

var. = C\i^,T) f <9) 

and if wj — Wj - 2v is fairly small, w, 4- ““ 2ci> 

cov. (10) 

In these formulae 

^(<o) -= c(o),r)-/’ (i - ')?(/) cos w/rf/ (11) 

S(<o, T) - sir 6>r^// (1 2) 

If p{T) is small we can replace C{w,T), S(co,r) by 

^(co) — C(w) - f pit) coso)/f//, 5(cu) ^ f pit) sin oitJt. 


From (9) we see that the variance of $\phi(\omega)$ is always greater than the square of its theoretical
mean value however great T may be, but from (10) the covariance of neighbouring $\phi(\omega)$ is small
of order (where ω is not small).

If, beside possible fluctuations, $\rho(s)$ decreases on the whole like $\exp(-ks)$, and if we take a
band of frequencies such that

$$\Delta\omega = 2\pi n/T$$

then the variance of the mean $\phi(\omega)$ over the band divided by the square of the theoretical mean is
of the form


where θ varies with ω, but is generally of the order of 1. Hence we should choose a band-width
of frequencies

mk        (14)

with m between 1 and kT.


Dr. Harold Jeffreys said that he had no experience in detecting empirical periodicities in
geophysical data. He had a good deal of experience of failing to find evidence for them. In
frequencies of earthquakes, for instance, the uncertainties found by the usual multinomial theory
were much too low because earthquakes at a given place did not occur independently; they might
occur in batches up to thousands over an interval of a few months. A single series of shocks in
one place seemed to follow a simple law of chance of the form $dt/(t + \alpha)$ with no superposed
periodicity.

He was a little surprised that nobody had mentioned that the Schuster criterion was identical
with $\chi^2$ for two degrees of freedom; although he believed Schuster was first, Turner had a useful
set of two-figure tables for harmonic analysis.

In estimation problems he used inverse probability or maximum likelihood, but in geophysics 
it was usually found that the calculation became prohibitively long, designs could not be balanced, 
and the normal law never held. Therefore the maximum likelihood method was usually employed 
for the main features, but corrections were applied for minor effects. Preliminary examination 
was needed to find which were the major and which the minor effects, and this might be the longest 
part of the work. 

In the variation of latitude problem there were measures of the direction cosines of the earth’s
axis at intervals of 0.1 year over 50 years. The free motion was presumed to be maintained by
irregular disturbances as for Yule’s pendulum. The recurrence relations were

$$l_n = \alpha\, l_{n-1} - \beta\, m_{n-1} + \sigma\,\xi_n$$

$$m_n = \beta\, l_{n-1} + \alpha\, m_{n-1} + \sigma\,\eta_n$$

where

$$\alpha = e^{-\kappa\tau}\cos\gamma\tau, \qquad \beta = e^{-\kappa\tau}\sin\gamma\tau.$$

If this was all, the determination of α, β, and hence κ, γ, by least-squares would be straightforward.
Dyson tried it, and found that this method led to an estimate of damping which was quite
inconsistent with the actual persistence of the amplitude and phase of the free movement. The
trouble was that the observed quantities were not the l, m of the equations, and that the error of
observation was not negligible. In fact it was larger than σ. One way of separating them
was to use an equation for observations p intervals apart; p = 24 worked fairly well. When he
studied the variation of the amplitude by harmonic analysis he found that it had changed by a
few large jerks, not by many small ones, as if the hits had been vigorous but infrequent. (There
was no physical explanation for this at present.) There was actually an interval of about 15
years when the motion seemed to show no irregularities but observational error, and he obtained
what he believed to be the best solution by using this period alone and taking σ zero. This reduced
the problem to the other extreme case, which was not difficult.

The other problem was concerned with the mass of the moon as determined from the motion
of Eros in the 1930-31 approach. These perturbations were not quite harmonic, but they were
proportional to a known function of the time, $f(t)$. The serial correlation came in because the
positions of the comparison stars had errors, part of the errors affecting all the stars in a region.
Sir H. Spencer Jones had dealt with the problem, but Dr. Jeffreys thought he had made some
improvements. He determined the mean residual at each time when $f(t)$ was zero, using residuals
over about five days about those times, and interpolated to get an allowance for star errors as
nearly as possible independent of $f(t)$. Then separate estimates of the coefficient for each interval
from one zero to the next were made, and a general estimate was obtained by combining the 
results by least-squares. The variation of these estimates among themselves provided an estimate 
of uncertainty. The errors of the datum values would be expected to produce a correlation of 

i between successive estimates, but this was small enough to be taken into account by a small 
correction. 

Mr. J. E. Moyal said that he had been interested for several years in continuous time series 
or, more generally, random functions of time of the type mentioned by Dr. Bartlett, because he 
was interested in the application of statistical methods to physical theory ; in physics it was the 
continuous type of random process that prevailed rather than the discrete. The notion of a 
random variable is generally sufficient in the equilibrium problems of statistical mechanics; the 
extension to the notion of random function of time becomes necessary in fluctuation problems — 
e.g., Brownian motion, electrical fluctuation; in the theory of non-uniform states; in the statistical
theory of turbulent motion in fluids. He was particularly interested in Dr. Bartlett’s approach 
to the formidable problem of estimation in this type of continuous process ; as Dr. Bartlett pointed 
out, this problem is by no means of academic interest only, since it is possible to devise physical 
instruments which will give a measure of various types of time averages, and a solution of the 
estimation problem is necessary to discriminate between them and choose the most efficient
instrument in the measurement of a given quantity. An example is the use of an electronic 
analyser to measure the spectrum of autocorrelation functions in electrical fluctuations and 
turbulent motion. This is required to give the mean square of the Fourier components of the 
random process ; the output meter used could be made to give readings of time averages of square, 
absolute value or peak amplitude; the solution of the estimation problem leads to the obvious 
choice of the instrument giving the time average of the square as the most efficient. Dr. Bartlett 
mentioned that the introduction of an uncorrelated impulse function in his equation 26 was not 
rigorous because it was not then possible to define the derivative of the velocity. It is possible 
to render Dr. Bartlett’s expression rigorous (a) by using the theory of the conservation of momentum 
and energy as in the classical theory of impulsive motion, (b) by supposing the random impulses
do possess correlations lasting over a period which is short compared to the period of the velocity 
autocorrelation {i.e., such that the impulse autocorrelation tends to zero within a period during 
which the velocity autocorrelation is still nearing unity). 

Major J. M. Hammersley said that Dr. Bartlett had discussed the “effective” number of
degrees of freedom appropriate to the estimates of variance of the autocorrelations of a time 
series. He mentioned the use of the factors 

1 in connection with the mean, 

1 in connection with variance and covariance. 

A recent official publication, prepared by Major Bayley and the speaker, enlarged upon this 
aspect of autocorrelation theory. Before giving a summarized version of the results of this 
publication, it was necessary to sketch in the nature of the problem which confronted them. 

In trials of anti-aircraft equipment, the instruments under trial produced a continuous set of 
data. For example, when a radar was tracking a moving target it necessarily fed to the predictor 
a continuous record of the estimated co-ordinates of the target; and these co-ordinates were 
continuous functions of time. For practical reasons, however, these data were only recorded
by cameras for analysis of the trial at discrete points of time. Usually the interval of time between
successive camera recordings was constant for any one set of recordings; the errors of the
radar might be photographed by a cine-camera running at 16 frames/sec., or by a single-shot
camera operated by a timing unit, at intervals of, say, 2 seconds. The performance of the equip- 
ment was then assessed by calculating various statistics (such as arithmetic mean, standard devia- 
tion, etc.) from these sets of recorded data. 

As a rough statement it might be said that the closer together in time were the recordings, 
the more was the information yielded about the behaviour of the equipment. On the other hand, 
the greater was the amount of computation necessary to produce the required result. A com- 
promise was therefore required between economy of computing time, photographic materials, 
etc., on the one hand, and amount of information in the result on the other hand. Bearing this 
in mind, the problem was “ What was the best time interval to choose between successive 
observations? ” 

Their approach to the problem was as follows: It was well known that the standard error
formulae for the unbiased estimates of the mean and variance of a set of n independent observations
were respectively

$$\text{var.}(\bar{x}) = \sigma^2/n, \qquad \text{var.}(s^2) = 2\sigma^4/(n - 1).$$

(The latter formula assuming a mesokurtic distribution of the x’s.) These formulae did not, of
course, hold when the readings were correlated in time; but they might then define two numbers
$n_m^*$ and $n_v^*$ such that

$$\text{var.}(\bar{x}) = \sigma^2/n_m^*, \qquad \text{var.}(s^2) = 2\sigma^4/(n_v^* - 1).$$

These quantities $n^*$ could be expressed in terms of n and the autocorrelation coefficients of the
series. Their formula for $n_m^*$ corresponded to that given by Mr. Yule in J.R.S.S., CVIII (1945);
while that for $n_v^*$ did not appear to have been given previously.

In the formula for var. $(s^2)$, $s^2$ was defined as the unbiased estimate of variance. As such it
differed slightly from the ordinary definition for independent observations, being in fact

$$\frac{n_m^*(n - 1)}{n(n_m^* - 1)}$$

times the usual definition.

If they supposed that the observations were evenly spaced in time along a continuous series,
the number of observations, n, would be inversely proportional to the time interval between
them: and the labour of computation would be directly proportional to n. They might then
define the efficiency of the computation by the percentage ratios

$$E_m = 100\, n_m^*/n, \qquad E_v = 100\, n_v^*/n.$$

They had prepared methods and tables for evaluating $n^*$. Hence they had been able to assess
the efficiency of the computation. On comparing this with the computation labour required, they
could arrive at an optimum time interval which would give a workable compromise between
efficiency and effort.

In preparing their tables they had assumed that the autocorrelation coefficients could be expressed
in the functional form

$$\rho(s) = e^{-\kappa s}\cos\lambda s.$$
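The trade-off described above can be sketched for the mean. The code below uses the standard result for a stationary series, var. $(\bar{x}) = (\sigma^2/n)\,[1 + 2\sum_{j=1}^{n-1}(1 - j/n)\rho(jh)]$, to obtain an effective number for the mean; it is an assumption that this matches the quantity tabulated by the speakers. The damped-cosine correlogram, the constants `kappa` and `lam`, the series length and all function names are hypothetical choices for illustration. The effective number for the variance is not attempted here.

```python
import math


def rho(s, kappa, lam):
    """Assumed damped-cosine autocorrelation: exp(-kappa * s) * cos(lam * s)."""
    return math.exp(-kappa * s) * math.cos(lam * s)


def effective_n_mean(n, h, kappa, lam):
    """Effective number of independent observations for the mean of n readings spaced h apart."""
    inflation = 1.0 + 2.0 * sum((1.0 - j / n) * rho(j * h, kappa, lam) for j in range(1, n))
    return n / inflation


def efficiency_table(total_time, intervals, kappa, lam):
    """Rows of (interval, n, effective n, percentage efficiency) for a series of fixed length."""
    rows = []
    for h in intervals:
        n = int(round(total_time / h))
        n_star = effective_n_mean(n, h, kappa, lam)
        rows.append((h, n, n_star, 100.0 * n_star / n))
    return rows


if __name__ == "__main__":
    # Hypothetical 60-second run; correlogram with period 2*pi/lam of about 6.3 seconds.
    for row in efficiency_table(60.0, [4.0, 2.0, 1.0, 0.5, 0.25], kappa=0.3, lam=1.0):
        print("h=%5.2f  n=%4d  n*=%8.2f  E=%6.1f%%" % row)
```

Shortening the interval always raises n, but the effective number need not rise in step with it, which is the sense in which an optimum interval exists.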

It might be mentioned that in first solving the problem for dispersion they confined their 
investigations to the standard error of the mean square error about the parametric mean. For 
this case they introduced the quantity which differs slightly from 

Major G. V. Bayley said that in following Major Hammersley he would confine his remarks 
to the practical application of the methods and tables which they had prepared. 

Several occasions arose where much was already known about an equipment, but where further
trials involving the measurements of the mean and dispersion of errors were necessary. He would
discuss firstly the case where some sample correlograms relating to the equipment were already
available. The first step was to estimate, in effect, the constants in the formula:

$$\rho(s) = e^{-\kappa s}\cos\lambda s$$

which would produce a reasonable “fit” for the observed values of $r_s$. Other formulae for ρ
were sometimes more appropriate.

Knowing the average length of a series, they then calculated n and $n^*$ from the tables for a
selection of time intervals. Finally an “optimum” time interval between observations could
be chosen for subsequent trials. This would depend on whether high reliability was required,
i.e., large $n^*$; or high efficiency, i.e., a large ratio of $n^* : n$; or whether a virtual independence
of observations was required with $n^*$ approximately equal to n.

Frequently, however, a sample correlogram was not available. But often a more or less 
reliable value of the period of the correlogram was known. They therefore examined n* and E 
for series with inherent autocorrelation of specified types. They reached a number of tentative 
conclusions : 

Firstly, as the time interval was reduced, n of course increased, but n* varied in a very different 
manner, reaching maximum and minimum points. Finally a stage was reached where it was 



1946] 


Discussion on the Papers 


93 


pointless to reduce the time interval further. Here large increases in n produced relatively small 
increases in /?*. 

Secondly, n and might differ appreciably. It was usually impracticable to quote either a 
standard error or n*^ for the estimated statistics. Nevertheless they felt that to quote n alone 
might be entirely misleading. 

Thirdly, for the mean, differed from for dispersion, w** might be several times greater 
than n, but n^* was never greater than n. The order of these differences depended on the damp- 
ing of the correlogram and the relation of the time interval to the “ period ” of the correlogram. 
These conclusions, on the reliability of the mean might be compared with some similar observa- 
tions by Mr. Kendall in J.R.S.S., CVIII, p. 96, paragraph 11. 

Finally they were able to recommend values of the time interval in terms of the correlogram 
period, which would give high efficiency. They also indicated which values should be avoided. 

A report had been issued setting out these methods in detail, and they would be glad to submit
it in condensed form as a written addendum to this discussion.†

They had mentioned this elementary but fundamental problem in autocorrelation, not only
to quote another instance of its practical application in gunnery, but also because they found
that this device, the conception of $n^*$, enabled them to represent their difficulties in a not very
easy subject, to those who had little time to study statistics in detail.

In conclusion, there was one other point to which he should like to refer. Dr. Cunningham 
and Mr. Hynd mentioned in their paper that there might be a wide field of application for an 
instrument which would evaluate, between limits, the integral of the product of two functions. 
An obvious example which occurred to the speaker, was the calculation of the large number of 
joint life annuities and other functions, required by actuaries. There the age separation of the 
two lives was analogous to lags in the autocorrelation coefficient. Might he, therefore, emphasize 
the great value of some figures, or even an opinion, indicating the accuracy of such instruments, 
not only in their present application but in other spheres of research ? 

Dr. Hartley wished to say a few words about the comparative efficiency of the calculating
machines mentioned by the speakers. A description had been given of two machines specially
designed for the calculation of serial correlations, and it might be of interest to compare their
efficiency with that obtained on standard commercial machines currently produced and serviced.
Under certain conditions the equipment most suited to the calculation of autocorrelations, if
carried out on a large scale, consisted of a Hollerith Tabulator, a Reproducer and Sorter. The
method and the plugging would be very special, and could not be given here.

To fix the idea they would consider a definite example of a serial correlation calculated from
50 two-figure observations. The Post Office relay-machine would cope with this accuracy,
its working speed being about 1 second per product formed. It would therefore produce a serial
correlation of this kind in about 50 seconds. The Hollerith equipment would form and print in 
some 80 seconds 12 sums of products representing 12 serial correlations. This worked out at a 
theoretical speed of about 7 seconds per serial correlation. Allowing for some contingencies, it 
would appear that the Hollerith installation would be about 6 times as fast as the Post Office 
relay-machine. 
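The arithmetic behind this comparison can be laid out as follows. The figures are those quoted in the paragraph above, with the relay machine's speed read as 1 second per product; the variable names are hypothetical.

```python
# Relay machine: one product per observation, about 1 second each.
observations = 50
relay_seconds_per_product = 1.0
relay_seconds_per_correlation = observations * relay_seconds_per_product

# Hollerith equipment: 12 sums of products formed and printed in some 80 seconds.
hollerith_batch_seconds = 80.0
hollerith_batch_size = 12
hollerith_seconds_per_correlation = hollerith_batch_seconds / hollerith_batch_size

# Theoretical ratio of speeds, before any allowance for contingencies.
speed_ratio = relay_seconds_per_correlation / hollerith_seconds_per_correlation

print(round(hollerith_seconds_per_correlation, 1))  # 6.7, quoted as "about 7 seconds"
print(speed_ratio)                                  # 7.5, reduced to "about 6 times" after contingencies
```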

The comparison was not a very fair one, because the Hollerith equipment mentioned would 
represent a larger installation. Nor was it easy to make any comparison of cost, because the 
Post Office machine was a single machine, specially made for the purpose, and the Hollerith 
equipment could not be purchased, it had to be hired. Nevertheless he would ask the last speaker 
whether his department had considered the possibility of using this standard Hollerith equipment. 
For even if the total amount of work in the department did not warrant an installation of this 
sort, there were possibilities of enlisting its service on an ad hoc hire basis.

With regard to the optical machine described by Mr. Foster, the comparison was reversed.
Its time was faster than that of the Hollerith Tabulator. On the other hand, the accuracy of 
the optical instrument was limited. The time to draw the curve would appear to be larger than 
that taken to punch numerical values of its ordinates on to Hollerith cards, but if, as happens 
in Mr. Foster's work, the graph is plotted directly by the recording instrument, this time would 
be virtually eliminated, and in these circumstances the optical instrument appears to be particularly 
efficient. Nevertheless, he would suggest that Mr. Foster had been a little unfair in stating that 
the time taken by his optical instrument was negligible compared to that needed in arithmetical 
calculation of serial correlations. His time, 5 seconds, was not very much shorter than that of
7 seconds, the theoretical time taken by the Hollerith. For the purpose for which Mr. Foster’s 
instrument had been designed it seemed to be admirable, and it would be difficult to find anything 
suited better for this particular purpose. 

† This addendum will be printed in Part II of this Journal.

Mr. Stone said he ought perhaps to make it clear at the outset that, through nobody’s fault,
he only became aware that he should be invited to speak at this meeting about twenty-four hours
before. He hoped, therefore, that he would be forgiven if his remarks were addressed to the subject
generally rather than to specific points raised in the very interesting papers which had been
read that afternoon.

There seemed little doubt that the type of analysis which had been the subject of these papers 
was destined to exercise a profound effect on those sciences in which it was claimed that cycles 
had been found empirically, but that these cycles were ones for which no satisfactory explanation 
could be obtained. It might well be that the natural sciences would be more affected than the 
social sciences, since in economics theories involving exact periodicities had for some time, he 
thought, been on the decline. The picture presented by an autoregressive scheme with appropriate 
coefficients was one of a system with a propensity to oscillate regularly, but which was constantly 
being disturbed by random shocks. This seemed to be a good representation of economic systems 
in the large, and, looking back, it was perhaps surprising that the search for exact periodicities had 
been so intensive in this field. 

Another aspect of the present representation was that dynamic models of economic systems 
must be stated in stochastic terms, and the success of prediction from such models must depend 
on the relative importance of the disturbances compared with the regular influences at work. 
A recognition of this problem had led to much interesting work on systems of stochastic simul- 
taneous equations with which the name of Haavelmo was particularly associated. This work 
threw light on the dangers of single equation systems for the calculation of structural coefficients 
where, as was usually the case in economics, the single equation formed part of a set of stochastic 
simultaneous equations. It seemed to the speaker that this field of work arose from a similar 
view of the phenomena under investigation to that adopted in the autoregressive approach, and
was a matter which might perhaps fruitfully be discussed by the Society on another occasion.

There was a final point he would like to make at the risk of stressing the obvious. To say
that the variables of a system were subject to random shocks as well as to the regular oscillatory 
tendencies inherent in the system was not the same as to say that those variables if subjected to 
factorial analysis would show large specific factors. The reason was, of course, that the dis- 
turbances might be incorporated in the common factors. The speaker had recently completed a 
factorial analysis of the components of the national income and expenditure in the United States 
of America over the short period 1922-38 (Barger’s data), and it appeared that over 97 per cent. 
of the variance of these components could be explained by three factors which might be identified
with the national income, the rate of change of the national income and time. He was not sug- 
gesting that this simple analysis would necessarily be equally successful in explaining variation 
if it were possible to analyse data covering a longer period. But the analysis did perhaps indicate
the close-knit character of variation in economic systems where the variables might be supposed 
to be influenced by random disturbances as well as by a regular mechanism of change. 

made it necessary to close a most interesting discussion 
which had ranged over a wider variety of applications of the subject introduced to the meeting 
than he could ever remember happening on any other occasion. He would invite the openers to 
reply to the discussion, but imagined that they would probably prefer to reply through the medium 
of the Journal. 

The following written contributions were received after the meeting : — 
F/Lt. W. R. Buckland : May I take this opportunity to thank Dr. Cunningham, and his 
paper and for the pleasant contacts during the war on the subject of 
bombing. A possible derivation of the term is interesting : OBOE - Hautbois - high wood - 
“ Mosquito,” an all-wood air-framed aircraft, with its high speed and altitude characteristics. 
The figure of 30,000 feet for the release altitude which appears in the results at Fig. 8, when 
coupled with the optical nature of the radar beam, completes the picture in this respect. An 
essential feature of accurate bombing, or target-marking, is an uninterrupted undeviating run over 
the target. This yields a relatively more simple statistical position to the defences. However, 
the relatively high speeds of the ” Mosquito ” gave it an excellent opportunity to cheat the defences 
in the target area, and thereby tended to minimize the effect of the .)•(/) component. In fact, we 
are told at 211 that is 30 yards only (for experimental data from P.F.F. aircraft). 

With regard to a^, it is of interest that, even though more extensive smoothing does mean 
some loss of instantaneity in the estimate of the forward velocity of the aircraft, the value of <7„ 
given at 4.74 m.p.h. is only of the order of per cent. of the forward velocity, and probably not 
greatly significant in view of the better bombing which results for values of a greater than 30' . 

In considering Fig. 8, with a value of a at 67 - 70% and using two of the Case (ii) curves 
where the smoothing period is constant (T = 20 seconds), although, as the paper points out, the 
bombing error for 1 ^ 20 seconds is lower than for t o seconds, it is clear that the theoretical 
bombing error (a,) is of the order of 300 ‘ 25 yards. From this it is but a short mental step to 
appreciate the enormous advance in aerial warfare which is known as ” The Battle of the Ruhr,” 
1943. Aircraft could operate and successfully bomb targets which had hitherto been protected 
by that cousin of London’s fog, RUHR haze, as well as by the more usual natural protection in 
the form of cloud. 




The climax of the use of the OBOE system was in the successful bombing of the myriad tactical 
targets prior to, and after, June 6th, 1944. On the eve of the D-Day assault heavy coastal defences 
were subjected to a pounding through 8-9/10ths of cloud. So effective was this, that the 
concentrated fire upon the landing beaches never materialized and the defences failed to carry out 
their allotted task in the enemy’s plans. For this type of target it was necessary to have a very 
high degree of accurate geographical location (to 1/100th part of a minute), based principally 
upon the work of the photographic interpreters on the various Intelligence Staffs. 

Dr. J. L. Spencer-Smith : Mr. Foster's work on cotton slivers and yarns has been very similar 
to ours at the Linen Research Institute on flax slivers, although there are certain differences. We 
may find disturbed periodic variations as opposed to oscillatory variations such as the drafting 
wave. For these the periodogram works well enough as a means of determining the period length, 
but our object was also to measure the amplitude of the variation. For this the periodogram was 
of little use because the amount of phase disturbance was not the same in all slivers. We were 
therefore led to the serial correlogram method, and here we met the added difficulty that owing to 
the length of flax-fibre strands, some of which may be 30 inches long, there is a very pronounced 
secular trend. Also for physical reasons the concept of a damped harmonic motion maintained by 
disturbances was untenable. It was to meet this type of series that I made the analysis to which 
Dr. Bartlett referred, and I agree with him that the concept of independent disturbances of ampli- 
tude, phase, and trend has only a limited application, although it may be a useful preliminary 
approach to a series when little is known about the physical processes which generate the series. 
It seems important, however, that such a series can produce a correlogram which does not differ 
significantly from those of the series of Mr. Kendall and Dr. Bartlett. I think this supports Mr. 
Foster's emphasis on the need for a physical approach to every time series, because the serial 
correlogram of any unknown series may be reproduced by several different physical processes. 
Thus for example we can reproduce the serial correlogram of practically any economic series on the 
basis of a textile sliver, although I do not suppose that the processes are the same. The reason for 
this is that the serial correlogram does not specify the time series completely. Some other data are 
required : the frequency distribution of the whole series may suffice, but I believe we may also have 
to consider both the form and the variation of the frequency distribution of short runs of consecutive 
terms along the series. 

If the same correlogram can be produced by different physical processes the question arises: 
Will a significance test calculated for one process hold for the serial correlogram resulting from 
another process ? In other words, is there any general significance test for the serial correlogram ? 

Dr. Bartlett, in reply : As I mentioned at the end of the meeting, I only partly agree with 
Mr. M. G. Kendall’s remarks on my own paper. I agree that much remains unsolved in the 
theory of time-series — for example, the problem of goodness of fit of the correlogram ; I also 
agree that we must be careful not to allow some of this mathematical theory, which has its 
most exact application in physical problems, to run away with us when we are applying it in 
the less exact sciences. On the other hand, I think it important to realize the generality of some 
of the theory I have described. The phrase “continuous time-series” is perhaps misleading, 
for the only essential continuity required is that the series exists continuously through time, and 
the shocks to which it may be subject are in fact of just the kind which, if I understand him 
correctly, Mr. Kendall is suggesting should be considered. Of course one obvious condition if 
a statistical analysis of any kind is to be possible is that no shock shall be of a unique and 
“ epoch-making ” character, such as, for economic series, the outbreak of war or of a general strike. 
Dr. Jeffreys mentions a problem where he was lucky enough to have a length of series with no 
shocks at all, but I suspect that the times when we can be certain that such a situation exists will 
be very limited in number. With regard to the construction of experimental continuous series, 
I pointed out that such series, with predictable theoretical properties, have been recorded in 
physical problems, and such experimental records can be used for theoretical studies. The 
Cowles Commission for Research in Economics have, I think, constructed and studied 
experimental series of this kind. I seem to recollect that Mr. Kendall, in his spoken remarks, raised 
also the question of superposed error, and queried the value of measuring it by observational 
differences. This reference of mine to what is done in practice (see last paragraph of section 6 
preceding the discussion of Wolfer's sunspot numbers) was not intended to be one of approval; 
I indicated that the effect should more properly be estimated by its depression of the correlation 
coefficients. 

While I do not altogether agree with Professor P. J. Daniell's statement that “ it is best . . . 
to assume normal distributions of variables ", I found his contribution most interesting, especially 
in its emphasis on the close relation, even for the sample, between correlogram and periodogram 
analysis, so that if the proper interpretation of either analysis is made, there should be no question 
of getting results by one method not implied by the other. 

The problem raised by Cunningham and Hynd of the survival chance from a number of shots 
with correlated errors is a generalization of the problem of a salvo of rounds with a common 





aiming error ; this case, which they call the “ shot-gun ” case, is important in 
a lot of study during the war, but the more general problem, for which it provides the upper 
limit, I have not seen discussed elsewhere in any detail before. 

Dr. Spencer-Smith’s query about correlograms is related to the unsolved problem of goodness 
of fit already referred to. For a stationary series a correlogram does specify the character of 
the series completely if the process is normal, but not otherwise. But I have noted that for a 
wide class of time-series the sampling errors of the correlogram are, to the first order, dependent 
only on the correlogram and not on other features of the distribution, to which any test of 
goodness of fit should therefore be insensitive. That is something, but not very much, for we 
cannot hope to estimate the true correlogram at all accurately unless we can specify it by one 
or two unknown constants, and this means specifying the underlying mechanism of the series. 

Finally, besides recording my personal thanks to the various speakers in the discussion, I 
would like to make two comments myself on the formulae in my own paper. 

(i) I have referred in connection with equation (25) of my paper to Campbell’s theorem, 
but the latter is usually restricted to random impulses of the simplest type characterized 
by the number occurring in any finite time interval being a Poisson distribution. Equation 
(25) is more general, being associated with the homogeneous random process, which may 
be regarded as a linear superposition of impulses belonging to such a simple type. 

(ii) I have confined myself to single time-series, but there is no theoretical difficulty 
in discussing by similar methods the joint distributional properties of more than one series, 
e.g. of two coupled series which, even when only of the first order, can exhibit similar 
oscillatory properties to the single series of the second order discussed in my paper. 

Such an extension will help to link the present work with the work on simultaneous stochastic 
equations referred to by Mr. Stone. I have worked out some of the corresponding correlational 
and sampling formulae, but in view of the length already of the printed contributions to this 
symposium, I do not propose to record them here in detail. 

Mr. Foster, in reply: I should like to comment on Drs. Daniels' and Hartley's summing 
up of the various methods for calculating correlograms. The advantages of the optical method 
are (a) speed and (b) that it will deal directly with traces from a recording instrument. The work 
on cotton slivers, which involved some hundreds of correlograms, would have been practically 
impossible by any of the other methods. (The planimeter integrator, for example, was specially 
designed for this work, but was found to be much too slow and laborious). 

When the data are in numerical form (b) becomes a disadvantage; but, if considerable 
numbers of correlograms are required, the speed of the method is such an advantage that it 
would be worth while to make a special machine to plot the data in the required form. This 
could either be photographic as suggested in the paper, or, better, could punch or cut the 
observations into opaque cards. 

When, however, only small numbers of correlograms are needed, the relay computer or the 
planimeter integrator have obvious advantages for dealing with numerical observations, and 
Dr. Hartley's Hollerith method, for those who have access to the equipment, combines these 
advantages with a very high speed of calculation. My comparison of the optical method with 
arithmetical calculation referred to calculation on the ordinary keyboard machines. 

It is difficult to give a precise reply to Mr. Kendall's question about the calculation of high 
order correlation coefficients by the optical method, because the method has not been pushed 
to the limit of its accuracy, and also because it has so far only been used on the traces from 
recording instruments. With some refinements in design I think it should be capable of 
calculating correlation coefficients to within 0.01 on a long series of discrete observations. In 
the work on cotton slivers instrumental errors were negligible compared with the variations from 
sample to sample of the same sliver. This would, I think, apply also to most other material to 
which correlogram analysis is likely to be applied. 

Dr. Cunningham and Mr. Hynd, in reply : In our paper we have tried to cover both the theoretical 
foundations of stochastic work, and also all the practical applications to military science which 
have come to our notice, and which have been carried through to a successful conclusion. So far 
as we are aware, therefore, the methods of autocorrelation analysis have not been applied to naval 
gunnery. There is, however, still much research of this nature to be done in the field of armament, 
both in improving the efficiency of the weapons of the present and in designing those of the future. 

In the problem of survival chances against a rapidly firing gun, we had given some consideration 
to an approach involving an adaptation of the Fokker-Planck Differential equation, comparable 
with the suggestion of Dr. Daniels, but difficulties were encountered, and the method which we have 
presented in our paper proved more fruitful. Considerable progress has also recently been made 
in obtaining inequalities which will determine, within reasonable limits, the survival chance for 
rather long bursts of gunfire, when the convergence of the series (3*25) is not satisfactory. 

Being a digital machine, the relay computer is in one sense perfectly accurate. It is, however, 




necessary to group the data to the nearest integer in the range ± 63 ; in general, this has a negligible 
effect on the correlogram, but if the grouping be very drastic, it is possible to introduce corrections 
analogous to Sheppard's corrections, which are valid when certain plausible assumptions are 
satisfied, and which have been worked out by Dr. Daniels while he was working with us at the Ministry 
of Aircraft Production. 

In comparing the relay computer with the Hollerith, we have formed the impression that the 
data take much longer to be prepared on the Hollerith cards than on the punched tapes which we 
use ; on the other hand, actual points on the correlogram are evaluated more rapidly by the Hol- 
lerith, once it has reached that stage in its processes. In consequence, the relay computer is the 
speedier as long as the number of points desired on the correlogram is small (perhaps less than 
1 : 5) compared with the number of data in the sequence to be correlated, while the Hollerith is the 
better under other conditions. 




Sequential Sampling Formulae for a Binomial Population. 


By J. P. Burman, B.A. 

Introduction. 

In this paper we shall be concerned with the schemes of sequential sampling, which have 
been developed by A. Wald¹ in America and G. A. Barnard² in his recent paper (see p. 1 of 
this Supplement). The primary object is to develop exact and workable formulae for the 
operating characteristic and average sample size of open schemes, and to give simplifications for 
small fractions defective. 

Section 1.1. 

The sampling procedure described by Barnard is as follows : 

Start with a score $H_2$. 

For each non-defect sampled, add 1. 

For each defect, subtract $b$. 

Accept the batch when the score reaches $2H = H_1 + H_2$. 

Reject the batch when the score falls to zero or below. 

It is assumed that the fractions defective which the scheme is designed to distinguish are reasonably 
small, so that the appropriate value of $b$ is large (usually at least 10) and may without inconvenience 
be taken as an integer. Naturally we may then suppose $H_1$, $H_2$ integers without further 
restriction. In practice the batch size will be finite, but it is supposed so large compared with the 
number normally sampled that it may be treated as infinite and the rectifying effect of sampling 
may be ignored. 
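
The scoring rules above can be traced with a short simulation. The following Python sketch is an illustration only: the function name `run_scheme`, the pseudo-random generator and the parameter values are assumptions made here and do not come from the paper, and the batch is treated as infinite, as in the text.

```python
import random

def run_scheme(h1, h2, b, p, rng):
    """Simulate one inspection; return (accepted, number_of_items_sampled)."""
    score, n = h2, 0                 # start with a score H2
    while 0 < score < h1 + h2:       # sample until a boundary is reached
        n += 1
        if rng.random() < p:         # defect: subtract b
            score -= b
        else:                        # non-defect: add 1
            score += 1
    return score >= h1 + h2, n       # accepted if the score reached H1 + H2

# Illustrative values only (H1 = H2 = 25, b = 10, fraction defective 0.05).
rng = random.Random(1)
results = [run_scheme(25, 25, 10, 0.05, rng) for _ in range(2000)]
accept_rate = sum(accepted for accepted, _ in results) / len(results)
mean_n = sum(n for _, n in results) / len(results)
```

The two averages are rough empirical counterparts of the operating characteristic and the average sample size derived in the later sections.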


Section 1.2. 

Let the score at any moment be $x$ and $u_x$ be the probability of the score reaching $2H$ without 
previously dropping to 0 (that is, the chance of acceptance of the batch). If during sampling the 
score again reaches $x$, the chance of acceptance is again $u_x$, for the past history of the score is clearly 
irrelevant. 

Now $u_x$ = (chance that next observation is a non-defect) × (chance of acceptance when score 
is $x + 1$) + (chance that next observation is a defect) × (chance of acceptance when score is 
$x - b$). 

Thus 
$$u_x = q\,u_{x+1} + p\,u_{x-b} \qquad (x = 1, 2, 3, \ldots, 2H - 1) \tag{i}$$ 
where $p$ = fraction defective, $q = 1 - p$. 

This is true as long as no critical score has yet been reached ; hence the restrictions on $x$. We also 
have : 
$$u_0 = u_{-1} = \ldots = u_{-b+1} = 0 \tag{ii}$$ 
$$u_{2H} = 1 \tag{iii}$$ 

The acceptance score is exactly $2H$. That on rejection varies from 0 to $(-b + 1)$. 

Now $u_1 = q\,u_2$. 

Hence $u_2 = \dfrac{u_1}{q}$, $u_3 = \dfrac{u_2}{q} = \dfrac{u_1}{q^2}$, and so on up to : $u_{b+1} = \dfrac{u_1}{q^b}$. 

The term $p\,u_{x-b}$ then begins to operate and the formula becomes more complicated. It is 
necessary to guess a solution. 



Consider the function : 
$$F(x) = \sum_{K=0}^{K_1} (-1)^K \binom{x - Kb - 1}{K} \frac{p^K}{q^{\,x - Kb}},$$ 
so that the general term is $(-1)^K \binom{x - Kb - 1}{K} p^K / q^{\,x - Kb}$, 

where $\binom{x - Kb - 1}{K} = (x - Kb - 1)(x - Kb - 2) \ldots (x - Kb - K)/K!$ 

The series continues as long as $x - Kb - K > 0$. 

$$F(x) - qF(x + 1) = \sum_{K=0}^{K_1} (-1)^K \left\{ \binom{x - Kb - 1}{K} - \binom{x - Kb}{K} \right\} \frac{p^K}{q^{\,x - Kb}}.$$ 

If $F(x + 1)$ contains one more term than $F(x)$ then the latter may have a term formally added to 
it which has a zero factor. $K_1$ = integral part of $\dfrac{x}{b + 1}$. 

$$F(x) - qF(x + 1) = -\sum_{K=1}^{K_1} (-1)^K \binom{x - Kb - 1}{K - 1} \frac{p^K}{q^{\,x - Kb}},$$ 
since the first term disappears, 
$$= p \left\{ q^{-(x-b)} - (x - 2b - 1)\, p\, q^{-(x-2b)} + \ldots \right\} = pF(x - b),$$ 
since when $K_1(b + 1) < x + 1$, $(K_1 - 1)(b + 1) < x - b$. 

$$\therefore\ F(x) = qF(x + 1) + pF(x - b).$$ 

So far $F$ has only been defined for $x > 0$. 

Let $F(x) = 0$ when $x \le 0$. 

Then $F(x)$ satisfies conditions (i) and (ii) for $u_x$. 

$F(x)/F(2H)$ also satisfies (iii). Hence it is equal to $u_x$. 

From which we obtain the operating characteristic or initial chance of acceptance 
$$u_{H_2} = \frac{F(H_2)}{F(2H)} = \frac{F(H_2)}{F(H_1 + H_2)},$$ 
since initially $x = H_2$. 

$F(x)$ is most conveniently written as : 
$$F(x) = q^{-x} \left\{ 1 - (x - b - 1)\,pq^b + \binom{x - 2b - 1}{2} (pq^b)^2 - \ldots \right\} \tag{iv}$$ 
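
As a numerical illustration of (iv), the sketch below evaluates the series for $F(x)$ and the operating characteristic $F(H_2)/F(H_1 + H_2)$, and compares the result with a direct forward solution of conditions (i) to (iii). The function names and the numerical values ($H_1 = H_2 = 25$, $b = 10$, $p = 0.05$) are assumptions chosen here for illustration; only the formulae themselves are taken from the text.

```python
from math import comb

def F(x, p, q, b):
    """Series (iv); p and q are kept as separate arguments. F = 0 for x <= 0."""
    if x <= 0:
        return 0.0
    total, K = 0.0, 0
    while x - K * b - K > 0:          # the series continues while x - Kb - K > 0
        total += (-1) ** K * comb(x - K * b - 1, K) * (p * q ** b) ** K
        K += 1
    return total / q ** x

def acceptance_probability(h1, h2, b, p):
    """Operating characteristic u_{H2} = F(H2) / F(H1 + H2)."""
    q = 1.0 - p
    return F(h2, p, q, b) / F(h1 + h2, p, q, b)

def acceptance_by_recurrence(h1, h2, b, p):
    """Independent check: solve (i)-(iii) directly by forward substitution."""
    q, top = 1.0 - p, h1 + h2
    u = {x: 0.0 for x in range(-b + 1, 1)}    # condition (ii)
    u[1] = 1.0                                 # arbitrary scale, fixed below
    for x in range(1, top):
        u[x + 1] = (u[x] - p * u[x - b]) / q   # recurrence (i) rearranged
    return u[h2] / u[top]                      # rescale so that u_{2H} = 1

oc_series = acceptance_probability(25, 25, 10, 0.05)
oc_direct = acceptance_by_recurrence(25, 25, 10, 0.05)
```

The two values should agree to rounding error, since the forward substitution reproduces $F(x)$ up to a constant factor.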


Section 2.1. 

Now let $v_x$ = probability of rejection when score is $x$. 

If it is assumed that $p + q = 1$, of course, $v_x = 1 - u_x$, but a solution will be required with $p$ 
and $q$ as independent constants. Conditions are : 

(i) $v_x = q\,v_{x+1} + p\,v_{x-b}$ $(1 \le x \le 2H - 1)$ 

(ii) $v_0 = v_{-1} = \ldots = v_{-b+1} = 1$. 

(iii) $v_{2H} = 0$. 

Now 
$$v_2 = \frac{v_1}{q} - \frac{p}{q}, \qquad v_3 = \frac{v_1}{q^2} - \frac{p}{q^2} - \frac{p}{q}, \ \ldots$$ 
and again it is necessary to guess the answer. 

Try $v_x = qF(x)\,v_1 - p\{F(x - 1) + F(x - 2) + \ldots + F(x - b)\}$. 

This is a linear function of the $F$’s and therefore satisfies condition (i) for $x > b$. Moreover, it 
is so chosen as to satisfy (i) for $1 \le x \le b$. We can make it satisfy (ii) by definition, since the trial 
solution has only been defined for $x > 0$. 

(iii) will be true if $0 = v_{2H} = qF(2H)\,v_1 - pG(2H)$, 
where $G(x) = F(x - 1) + F(x - 2) + \ldots + F(x - b)$. 

Then 
$$v_1 = \frac{p\,G(2H)}{q\,F(2H)}, \qquad v_x = p \left\{ \frac{F(x)\,G(2H)}{F(2H)} - G(x) \right\} \tag{v}$$ 


Section 2.2. 

Now let $u_{x,y}$, $v_{x,y}$ be the probabilities of acceptance and rejection in exactly $y$ more steps when the 
present score is $x$. 

$u_{x,y}$ satisfies 

(i) $u_{x,y} = q\,u_{x+1,\,y-1} + p\,u_{x-b,\,y-1}$ 

(ii) $u_{x,0} = 0$ $(1 \le x \le 2H - 1)$ 

(iii) $u_{x,y} = 0$ $(-b + 1 \le x \le 0$, all $y)$ 

(iv) $u_{2H,y} = 1$ $(y = 0)$, $= 0$ $(y > 0)$. 

(i) is true for $1 \le x \le 2H - 1$, $y > 0$. 

Let $u_x(t) = \sum_{y=0}^{\infty} u_{x,y}\,t^y$, which is the generating function of the sample size for acceptance. 

Multiplying (i) by $t^y$ and summing from $y = 1$ to infinity, 
$$u_x(t) = u_x(0) + qt\,u_{x+1}(t) + pt\,u_{x-b}(t)$$ 
and since $u_x(0) = 0$ $(1 \le x \le 2H - 1)$ by (ii), 

(i)′ $u_x(t) = qt\,u_{x+1}(t) + pt\,u_{x-b}(t)$. 

The boundary conditions become : 

(ii)′ $u_0(t) = u_{-1}(t) = \ldots = u_{-b+1}(t) = 0$. 

(iii)′ $u_{2H}(t) = 1$. 

Thus $u_x(t)$ obeys the same three conditions as $u_x$ with the replacement of $p$ and $q$ by $pt$ and $qt$. 
The same argument relates $v_x(t)$ and $v_x$, but the solutions to be used must treat $p$ and $q$ as 
independent and not use $p + q = 1$. 

Hence 
$$u_x(t) = \frac{F(x, t)}{F(2H, t)}, \qquad v_x(t) = pt \left\{ \frac{F(x, t)\,G(2H, t)}{F(2H, t)} - G(x, t) \right\} \quad \text{from (v)}$$ 
where 
$$F(x, t) = (qt)^{-x} \left\{ 1 - (x - b - 1)(pq^b t^{b+1}) + \binom{x - 2b - 1}{2} (pq^b t^{b+1})^2 - \ldots \right\}$$ 
and 
$$G(x, t) = \sum_{z=1}^{b} F(x - z, t).$$ 

Finally 
$$u_{H_2}(t) + v_{H_2}(t) = \frac{F(H_2, t)}{F(2H, t)} \{ 1 + pt\,G(2H, t) \} - pt\,G(H_2, t) \tag{vi}$$ 
is the generating function of sample size, since initially $x = H_2$. 


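Section 2.3. 

The average sample size $A(p)$ (say) is obtained from differentiating (vi) with respect to $t$ and 
putting $t = 1$. 

$$A(p) = \left[ \frac{d}{dt} \frac{F(H_2, t)}{F(2H, t)} \right]_{t=1} \{ 1 + pG(2H) \} + u_{H_2}\, p \{ G(2H) + G_1(2H) \} - p \{ G(H_2) + G_1(H_2) \} \tag{vii}$$ 

where $F_1(x) = \left[ \dfrac{\partial F(x, t)}{\partial t} \right]_{t=1} = -\sum_K (x - Kb - K)(-1)^K \binom{x - Kb - 1}{K} \dfrac{p^K}{q^{\,x - Kb}}$ 

and similarly $G_1$ is defined. 

Now it follows from the recurrence relation (i)′ satisfied by $F(x, t)$ that 
$$G(x, t) = pt\,G(x - b, t) + qt\,G(x + 1, t)$$ 
$$= pt\,G(x - b, t) + qt\,G(x, t) + qt \{ F(x, t) - F(x - b, t) \}$$ 
$$\therefore\ G(x, t) = \frac{pt}{1 - qt} G(x - b, t) + \frac{qt}{1 - qt} \{ F(x, t) - F(x - b, t) \} \quad \text{for } x > b.$$ 

Repeated application of this relation leads to the following: 
$$G(x, t) = \frac{qt}{1 - qt} F(x, t) + \left( \frac{pt}{1 - qt} - 1 \right) \frac{qt}{1 - qt} \left\{ F(x - b, t) + \frac{pt}{1 - qt} F(x - 2b, t) + \ldots + \left( \frac{pt}{1 - qt} \right)^{K-1} F(x - Kb, t) \right\} - \frac{(pt)^K}{(1 - qt)^{K+1}}$$ 
where $0 < x - Kb \le b$. 

Hence 
$$G(x) = \frac{q}{p} F(x) - \frac{1}{p} \tag{viii}$$ 

Multiplying $G(x, t)$ by $pt$, differentiating twice with respect to $t$ and putting $t = 1$ in each case, 
$$p \{ G(x) + G_1(x) \} = qF_1(x) + \frac{q}{p}(1 + p)F(x) + \frac{q}{p} \{ F(x - b) + F(x - 2b) + \ldots + F(x - Kb) \} - \frac{K + 1}{p} \tag{ix}$$ 

$$p \{ 2G_1(x) + G_2(x) \} = qF_2(x) + \frac{2q}{p}(1 + p)F_1(x) + \frac{2q}{p^2} F(x) + \frac{2q}{p^2} [ 2F(x - b) + 3F(x - 2b) + \ldots + (K + 1)F(x - Kb) ]$$ 
$$+ \frac{2q}{p} [ F_1(x - b) + F_1(x - 2b) + \ldots + F_1(x - Kb) ] - \frac{(K + 1)(K + 2q)}{p^2} \tag{x}$$ 

Substituting (viii) and (ix) in (vii) leads to: 
$$A(p) = \frac{u_{H_2}}{p} \{ q ( F(2H - b) + F(2H - 2b) + \ldots + F(2H - K'b) ) - (K' + 1) \}$$ 
$$- \frac{1}{p} \{ q ( F(H_2 - b) + F(H_2 - 2b) + \ldots + F(H_2 - Kb) ) - (K + 1) \} \tag{xi}$$ 

where $0 < 2H - K'b \le b$, 
$0 < H_2 - Kb \le b$. 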

Section 3. 

By differentiating the generating function (vi) twice with respect to $t$ and putting $t = 1$, we get 
the second factorial moment $\mu'_{[2]}(p)$ of the sample size distribution. The variance $= \mu'_{[2]} + \mu'_1 - \mu_1'^{\,2}$, 
where $\mu'_1 = A(p)$, the average sample size. 

Substituting (viii), (ix) and (x) in $\mu'_{[2]}(p)$ it is found finally that : 

$$\mu'_{[2]}(p) = \frac{2}{pF(2H)} \{ F_1(H_2) - u_{H_2} F_1(2H) \} \{ q ( F(2H - b) + F(2H - 2b) + \ldots + F(2H - K'b) ) - (K' + 1) \} + \frac{u_{H_2}}{p^2} Q(2H) - \frac{1}{p^2} Q(H_2) \tag{xii}$$ 

where 

$$Q(x) = 2q \{ 2F(x - b) + 3F(x - 2b) + \ldots + (K + 1)F(x - Kb) \} + 2pq \{ F_1(x - b) + F_1(x - 2b) + \ldots + F_1(x - Kb) \} - (K + 1)(K + 2q)$$ 

and $0 < x - Kb \le b$ as before. 





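Section 4. 

For many industrial applications the working range covers only very small $p$ values and 
$u_{H_2}$ is negligible except in this range. This is the case when $H_1$, $H_2$, and $b$ are large. We write 
$H_1 = R_1 b$, $H_2 = R_2 b$, $pb = X$ and let $b \to \infty$, keeping $R_1$, $R_2$, $X$ fixed so as to find the limiting 
properties of this type of scheme. 

Put $1 - p = e^{-p} + O(p^2) = e^{-X/b} + O(b^{-2})$ 

$$\binom{H_2 - Kb - 1}{K} (pq^b)^K = \frac{(R_2 - K)^K}{K!} X^K e^{-KX} + O(b^{-1})$$ 

with a similar approximation for $(H_1 + H_2)$. 

Thus $F(H_2)$ has the limiting form $F_{R_2}(X)$, 

where $F_R(X) = e^{RX} \left( 1 - (R - 1)Xe^{-X} + \dfrac{(R - 2)^2}{2!} X^2 e^{-2X} - \ldots \right)$ 

$u_{H_2}$ has the limiting form $u(R_1, R_2, X) = \dfrac{F_{R_2}(X)}{F_{R_1 + R_2}(X)}$ 

$\dfrac{A(p)}{b}$ has the limiting form 
$$A(R_1, R_2, X) = \frac{u(R_1, R_2, X)}{X} \{ F_{R_1 + R_2 - 1}(X) + \ldots + F_{R_1 + R_2 - K'}(X) - (K' + 1) \}$$ 
$$- \frac{1}{X} \{ F_{R_2 - 1}(X) + F_{R_2 - 2}(X) + \ldots + F_{R_2 - K}(X) - (K + 1) \}$$ 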
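
The limiting operating characteristic can be compared with the exact one for a large but finite $b$. The sketch below is illustrative: the function names and the chosen values ($R_1 = R_2 = 2.5$, $X = 0.5$, $b = 1000$) are assumptions made here, and the rule that the limiting series stops when $R - K$ ceases to be positive is inferred from the termination rule for the exact series.

```python
from math import comb, exp, factorial

def limiting_F(R, X):
    """F_R(X) = e^{RX} * sum_K (-1)^K (R-K)^K X^K e^{-KX} / K!, while R - K > 0."""
    total, K = 0.0, 0
    while R - K > 0:
        total += (-1) ** K * (R - K) ** K * X ** K * exp(-K * X) / factorial(K)
        K += 1
    return exp(R * X) * total

def limiting_acceptance(R1, R2, X):
    """u(R1, R2, X) = F_{R2}(X) / F_{R1+R2}(X)."""
    return limiting_F(R2, X) / limiting_F(R1 + R2, X)

def exact_acceptance(h1, h2, b, p):
    """Exact operating characteristic F(H2) / F(H1 + H2) from series (iv)."""
    q = 1.0 - p

    def F(x):
        total, K = 0.0, 0
        while x - K * b - K > 0:
            total += (-1) ** K * comb(x - K * b - 1, K) * (p * q ** b) ** K
            K += 1
        return total / q ** x

    return F(h2) / F(h1 + h2)

b = 1000                                   # a large penalty, for illustration
u_limit = limiting_acceptance(2.5, 2.5, 0.5)
u_exact = exact_acceptance(2500, 2500, b, 0.5 / b)
```

With $b$ this large the two values should differ only by a term of order $1/b$.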


Summary. 


Formulae are obtained relating to a sequential scheme with starting score $H_2$, winning score 
$(H_1 + H_2)$, and penalty (for a failure) $b$. For fraction defective $p$, 

Probability of acceptance $u_{H_2} = \dfrac{F(H_2)}{F(H_1 + H_2)}$ 

where 
$$F(H) = q^{-H} \left\{ 1 - (H - b - 1)\,pq^b + \binom{H - 2b - 1}{2} (pq^b)^2 - \ldots \right\}$$ 
the series terminating when the binomial coefficient vanishes or the largest factor in its numerator 
ceases to be positive. Formula (xi) gives the average sample size $A(p)$ and (xii) the variance of the 
sample size distribution. 

The limiting form for $u_{H_2}$, where $b$ tends to infinity, $R_i = H_i/b$ is constant ($i$ = 1, 2) and $X = pb$, is : 
$$u(R_1, R_2, X) = e^{-R_1 X}\, \frac{1 - (R_2 - 1)Xe^{-X} + \dfrac{(R_2 - 2)^2}{2!} X^2 e^{-2X} - \ldots}{1 - (R_1 + R_2 - 1)Xe^{-X} + \dfrac{(R_1 + R_2 - 2)^2}{2!} X^2 e^{-2X} - \ldots}$$ 

$A(p)$ also has a limiting form if measured in multiples of $b$. From these formulae (either in their 
limiting forms or not) the corresponding curves of $u_{H_2}$ and $A(p)$ plotted against $p$ may be drawn. 
These are known respectively as the “ Operating Characteristic ” and the “ Average Sample Size 
Curve,” and they give a fairly complete idea of how the scheme will work in practice, enabling 
it to be compared with other sampling inspection schemes. The calculations for the variance of 
the sample size are rather more complicated but not entirely impracticable. 

Tables of percentage points for the operating characteristic and of average sample sizes for a 
considerable range of open and closed schemes will shortly be published by Mr. F. J. Anscombe.³ 
The fundamental tables of the F function are available from Dr. H. O. Hartley of the Scientific 
Computing Service. 





Acknowledgments. 

The author wishes to thank Mr. G. A. Barnard for drawing his attention to the problem, and 
Mr. F. J. Anscombe for advice and encouragement throughout, and in particular for the idea of 
considering the limiting forms when b is large. 

All this work was done while the author was at the Ministry of Supply, and thanks are due to 
the Chief Scientific Officer for permission to publish. 

References 

¹ Wald, A., “Sequential Tests of Statistical Hypotheses,” Ann. Math. Stat., 16 (1945). 

² Barnard, G. A., “Sequential Tests in Industrial Statistics,” Supplement to J. Roy. Stat. Soc. (1946). 

³ Anscombe, F. J. (To appear in the Supplement, Part II, 1946.) 





Some Properties of Closed Sequential Schemes 
By C. M. Stockman and P. Armitage 

In a recent paper, Barnard ¹ has outlined the theory and development of sequential sampling. 
In a sequential sampling scheme the sample size is a random variable, but it may sometimes be 
desirable to fix an upper limit to the sample size. Such a scheme may be called a closed sequential 
scheme. We shall consider Wald sequential schemes closed by the condition that if no decision 
has been reached by the time a sample of a certain size has been taken, the batch will be accepted 
if the score is then greater than some number, z, and rejected if the score is less than or equal 
to z (using the method of scoring introduced by Barnard). Our notation for the quantities defining 
a scheme is the same as that used by Barnard, except that his $H$ and $H'$ are replaced by $H_1$ and $H_2$, 
so that the scoring procedure becomes : 

Start the score at $H_2$. 

Add 1 mark for a good item. 

Subtract $b$ marks for a bad item. 

Accept if the score reaches $H_1 + H_2$. 

Reject if the score reaches or falls below zero. 

This paper gives a method of obtaining the operating characteristic for any closed scheme. 
The limiting case as $b \to \infty$ and $R_1 = H_1/b$ and $R_2 = H_2/b$ are constant is dealt with, and a method 
is given of evaluating the average sample size as a multiple of $b$, for any fraction defective as a 
multiple of $1/b$. This limiting form is more convenient to use both from the point of view of 
numerical calculation, and also because it lends itself more readily to theoretical investigation. 

Anscombe ² has discussed the application of sequential sampling schemes, and has tabulated 
the operating characteristic and average sample size curves for various closed schemes which 
were obtained by the methods of this paper. He has also discussed the use of the limiting form 
by substituting a finite value for $b$. 

1. The lattice diagram. 

For a description and explanation of the lattice diagram, the reader should consult Barnard’s 
paper. Some of the more important dimensions of the lattice diagram for a Wald scheme are 
given in para. 4.1. 

We shall first prove two simple but important properties of the lattice diagram : 

(i) If the sampling is random, the probability of reaching $(x, y)$ 
$$= N p^y (1 - p)^x, \tag{1}$$ 
where $N$ is the number of paths to $(x, y)$ which are not interrupted by the acceptance and rejection 
boundaries. We shall call $N$ the number of admissible paths (or simply the number of paths). 

For all orders of the $x$ non-defects and $y$ defects are equally likely, 

∴ Probability of reaching $(x, y)$ = $N$ × (Probability of reaching $(x, y)$ by any one path). 

Now the probability of reaching $(x, y)$ by getting $x$ non-defects followed by $y$ defects 
$= p^y (1 - p)^x$, where $p$ is the fraction defective. 

∴ Probability of reaching $(x, y)$ $= N p^y (1 - p)^x$. 

(ii) The number of paths from $(f, g)$ to $(f + x, g + y)$ if no paths are interrupted by the 
boundaries $= {}^{x+y}C_x$. (2) 

For the number of paths is equal to the number of ways of choosing the positions of the $x$ 
non-defects in the sample of $(x + y)$, or the positions of the $x$ unit displacements parallel to $Ox$ 
in a total of $(x + y)$ displacements, $= {}^{x+y}C_x$. 

If a sequential scheme is closed at a certain sample size, the two boundaries on the lattice 
diagram are joined by a diagonal line, some of the points on which will be assigned to the acceptance 




boundary and some to the rejection boundary. The object of the following method is to find the 
number of admissible paths to the acceptance points, both on the original acceptance boundary 
and on the diagonal cut-off, and thence by (1) the probability of reaching each of these points. 

2. Exact method. 

An inner boundary is drawn on the lattice diagram, i.e., one which keeps at a distance of 1 
unit from the acceptance and rejection boundaries and coincides with the axes before the boundaries 
start, as shown in Fig. 1. 

The area contained by this inner boundary is composed of a preliminary part ($M_1$ and $M_2$ in 
Fig. 1), followed by a repetitive part ($A$ and $B$ in Fig. 1). In any Wald sequential scheme the 
repetitive part of this area can be divided into two blocks of the shapes shown in Fig. 2, which will 
be called $A$ and $B$ respectively. The lengths of the horizontal and diagonal sides of these blocks 
depend on $H_1$, $H_2$ and $b$; and it is shown in para. 4.1 that when $(H_1 + H_2)/(b + 1)$ is integral 
the block $B$ vanishes, making the solution simpler. 

[Fig. 1.] 

[Fig. 2.] 

Now, with each block is associated a matrix, 
the $(i, j)$th element of which is the number of paths from the $i$-th point of the L.H. diagonal side to 
the $j$-th point of the R.H. diagonal side (counting from the top), all of the elements being binomial 
coefficients, according to (2) above. Further, the matrices may be multiplied together so that the 
$(i, j)$th element of the product of two adjacent matrices, $S = (s_{ij})$ and $T = (t_{ij})$, is the number of 
paths from the $i$-th point of the L.H.S. of $S$ to the $j$-th point of the R.H.S. of $T$. 

For this number of paths 

$= \sum_k s_{ik} t_{kj}$, where $k$ refers to the $k$-th point of the R.H.S. of $S$ or the L.H.S. of $T$, 

$= u_{ij}$, where $U = ST$. 

The area before the repetitive blocks start — i.e., on the left of the first A — can also be divided 
into blocks of a similar kind, so that the elements of the corresponding matrices are binomial 
coefficients as before. The blocks will not in general be of the same size or shape as $A$ and $B$, 
and their number will depend on $H_1$, $H_2$ and $b$. The first of these “ irregular ” matrices will, of 
course, be a row matrix, the elements of which give the numbers of paths from the origin to the 
points on the R.H.S. of the first block. 

For convenience, the symbols A, B, M, Y, etc., will be used for both the blocks and the corre- 
sponding matrices. 



J06 Stockman and Armitage — Some Properties of Closed Sequential Schemes CNo. 1* 

Example. 

Fig. 1 represents the scheme having $H_1 = H_2 = 25$, $b = 10$. The repeating blocks are A 
and B, and the irregular blocks are $M_1$ and $M_2$. 

Defining $M = M_1 M_2$ and $Y = AB$, the elements of the row matrix $MY^n$ will be the number 
of admissible paths from the origin to the points on the diagonal line representing a sample of 
$19 + 11n$. 


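We have 

$$M_1 = \begin{pmatrix} {}^{8}C_2 & {}^{8}C_1 & 1 \end{pmatrix} = \begin{pmatrix} 28 & 8 & 1 \end{pmatrix},$$

$$M_2 = \begin{pmatrix} {}^{11}C_1 & 1 & 0 & 0 \\ {}^{11}C_2 & {}^{11}C_1 & 1 & 0 \\ {}^{11}C_3 & {}^{11}C_2 & {}^{11}C_1 & 1 \end{pmatrix} = \begin{pmatrix} 11 & 1 & 0 & 0 \\ 55 & 11 & 1 & 0 \\ 165 & 55 & 11 & 1 \end{pmatrix},$$

$$A = \begin{pmatrix} {}^{6}C_1 & 1 & 0 & 0 \\ {}^{6}C_2 & {}^{6}C_1 & 1 & 0 \\ {}^{6}C_3 & {}^{6}C_2 & {}^{6}C_1 & 1 \\ {}^{6}C_4 & {}^{6}C_3 & {}^{6}C_2 & {}^{6}C_1 \end{pmatrix} = \begin{pmatrix} 6 & 1 & 0 & 0 \\ 15 & 6 & 1 & 0 \\ 20 & 15 & 6 & 1 \\ 15 & 20 & 15 & 6 \end{pmatrix},$$

$$B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ {}^{5}C_1 & 1 & 0 & 0 \\ {}^{5}C_2 & {}^{5}C_1 & 1 & 0 \\ {}^{5}C_3 & {}^{5}C_2 & {}^{5}C_1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 5 & 1 & 0 & 0 \\ 10 & 5 & 1 & 0 \\ 10 & 10 & 5 & 1 \end{pmatrix},$$

whence 

$$M = \begin{pmatrix} 913 & 171 & 19 & 1 \end{pmatrix}$$

and 

$$Y = \begin{pmatrix} 11 & 1 & 0 & 0 \\ 55 & 11 & 1 & 0 \\ 165 & 55 & 11 & 1 \\ 325 & 155 & 45 & 6 \end{pmatrix},$$

$$MY = \begin{pmatrix} 22908 & 3994 & 425 & 25 \end{pmatrix},$$

$$MY^2 = \begin{pmatrix} 549908 & 94092 & 9794 & 575 \end{pmatrix}$$ and so on. 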
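
The arithmetic of the Example can be checked mechanically. The following is a minimal Python sketch, not part of the original paper: the helper names `block` and `matmul` are illustrative assumptions, and each block matrix is built on the assumption that its $(i, j)$th entry is a binomial coefficient of the block length, as described above.

```python
from math import comb

def block(steps, n_rows, n_cols, shift):
    """Block matrix whose (i, j) entry is C(steps, i - j + shift), or 0 when out of range."""
    rows = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            k = i - j + shift
            row.append(comb(steps, k) if 0 <= k <= steps else 0)
        rows.append(row)
    return rows

def matmul(P, Q):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

# Blocks of the Example (H1 = H2 = 25, b = 10).
M1 = block(8, 1, 3, 2)    # (8C2, 8C1, 1)
M2 = block(11, 3, 4, 1)   # 3 x 4 irregular block, 11 steps long
A = block(6, 4, 4, 1)     # repeating block A, 6 steps long
B = block(5, 4, 4, 0)     # repeating block B, 5 steps long

M = matmul(M1, M2)
Y = matmul(A, B)
MY = matmul(M, Y)
MY2 = matmul(MY, Y)

print(M)
print(MY)
print(MY2)
```

The last element of each row matrix is the number of admissible paths to the corresponding acceptance point on the lower boundary.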


The numbers of admissible paths to the acceptance points on the lower boundary, $(H_1, 0)$, 
$(H_1 + b, 1)$, $(H_1 + 2b, 2)$, etc., are the last elements of the row matrices $M$, $MY$, $MY^2$, etc., and so 
from (1) the probabilities of reaching these points for any fraction defective can be calculated. 

It will be convenient to make the diagonal cut-off coincide with the R.H.S. of one of the blocks 
A or B. In the first case the maximum sample size will be of the form $H_1 + n(b + 1)$, and the 
numbers of paths to the points on this cut-off will be given by the elements of the matrix $MY^nA$. 
In the second case we shall use the elements of the matrix $MY^n$ (as in the Example above), but 
the maximum sample size cannot be expressed so simply as before. Again the probabilities of 
reaching these points may be calculated from (1). 

The total probability of acceptance for any fraction defective can thus be found, and the 
operating characteristic may be obtained. The result will of course depend on which points of the 
cut-off we assign to the acceptance boundary. 

The calculation of the average sample size would involve finding the probabilities of rejection 
at a large number of rejection points ; although this could be done by a matrix method, it would 
involve a great deal of work. It is suggested, therefore, that the limiting form given in para. 3. 2. 
be used as an approximation. 

3. Limiting form as $b \to \infty$. 

3.1. Operating characteristic. 

Suppose that $b \to \infty$ in such a way that $H_1/b = R_1$ and $H_2/b = R_2$, where $R_1$ and $R_2$ are 
constants. Then it clearly makes no difference in the limit whether the lattice diagram represents 
$H_1/b = R_1$ and $H_2/b = R_2$ or $H_1/(b + 1) = R_1$ and $H_2/(b + 1) = R_2$. We shall consider the 
latter case, for, as has been pointed out in para. 2, and is proved in para. 4.1, the block B then 
vanishes if $R_1 + R_2$ is integral, and we need consider only A as repetitive. 




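Fig. 3 shows the diagram for $b = 10$, $H_1 = H_2 = 22$. In the general case for $R_1 = R_2 = 2$, 
we have: 

$$M = \begin{pmatrix} {}^{b+1}C_2 & {}^{b+1}C_1 & 1 \end{pmatrix}, \qquad Y = A = \begin{pmatrix} {}^{b+1}C_1 & 1 & 0 \\ {}^{b+1}C_2 & {}^{b+1}C_1 & 1 \\ {}^{b+1}C_3 & {}^{b+1}C_2 & {}^{b+1}C_1 \end{pmatrix},$$

and $MY^n = \left( a_{1,n} b^{n+2} + O(b^{n+1}),\; a_{2,n} b^{n+1} + O(b^{n}),\; a_{3,n} b^{n} + O(b^{n-1}) \right)$. 

Now $a_{1,n+1} = a_{1,n} + a_{2,n}/2! + a_{3,n}/3!$, and similarly $a_{2,n+1}$, $a_{3,n+1}$ may be expressed as linear 
functions of $a_{1,n}$, $a_{2,n}$ and $a_{3,n}$; the coefficient of the highest power of $b$ in any element depends 
only on the coefficient of the highest power in elements previously obtained. 

We may thus write 

$$M \sim \begin{pmatrix} b^2/2! & b & 1 \end{pmatrix} \quad \text{and} \quad Y \sim \begin{pmatrix} b & 1 & 0 \\ b^2/2! & b & 1 \\ b^3/3! & b^2/2! & b \end{pmatrix}$$

and by successive multiplication obtain $a_{1,n} b^{n+2}$, $a_{2,n} b^{n+1}$, $a_{3,n} b^{n}$. 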

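The same procedure may clearly be followed for any value of $R_1$ and $R_2$, and we shall have 
(when $R_1$ and $R_2$ are integers) 

$$M_1 \sim \left( \frac{b^{R_2}}{R_2!},\; \frac{b^{R_2-1}}{(R_2-1)!},\; \ldots,\; b,\; 1 \right),$$

$$M_2 \sim \begin{pmatrix} b & 1 & 0 & \cdots & 0 \\ b^2/2! & b & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ \dfrac{b^{R_2+1}}{(R_2+1)!} & \cdots & \cdots & b & 1 \end{pmatrix}, \quad \ldots,$$

$$M_m \sim \begin{pmatrix} b & 1 & 0 & \cdots & 0 \\ b^2/2! & b & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ \dfrac{b^{R_1+R_2-2}}{(R_1+R_2-2)!} & \cdots & \cdots & b & 1 \end{pmatrix} \qquad (m = R_1 - 1),$$

$$A = Y \sim \begin{pmatrix} b & 1 & 0 & \cdots \\ b^2/2! & b & 1 & \cdots \\ \vdots & & & \vdots \\ \dfrac{b^{R_1+R_2-1}}{(R_1+R_2-1)!} & \cdots & b^2/2! & b \end{pmatrix},$$

whence $M \sim M_1 M_2 \ldots M_m \sim \left( a_{r,0}\, b^{R_1+R_2-1-r} \right)$, $r = 1, 2, \ldots, (R_1 + R_2 - 1)$, 

and $MY^n \sim \left( a_{r,n}\, b^{R_1+R_2+n-1-r} \right)$, $r = 1, 2, \ldots, (R_1 + R_2 - 1)$ 

(see para. 4.1). 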

(If $R_2$ is not an integer, we must replace $R_2$ by $y_1 - 1$, except when $R_1 + R_2$ occurs, and $m = 
R_1 - 1$ by $m = R_1 + R_2 - y_1$. See para. 4.1.) When writing down the matrices, the reader 
will always find it helpful to draw the lattice diagram for the particular scheme. 

In para. 4.1 we prove that the number of paths from the origin to the point $(xb + x', y)$ is $\sim N b^{y}$, 
where $x$, $x'$, $y$ and $N$ are finite. (If this notation is compared with that in para. 1, it will be seen 
that $x$ and $N$ have been replaced by $xb + x'$ and $N b^{y}$, so as to keep $x$ and $N$ finite.) 

The probability of reaching this point is, from (1), 

$\sim N b^{y} \left( \dfrac{X}{b} \right)^{y} \left( 1 - \dfrac{X}{b} \right)^{xb + x'}$, where the fraction defective $= X/b$, 

$\to N X^{y} e^{-xX}$, a finite quantity. 
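
The passage to the limit can be illustrated numerically. The sketch below is an illustration only: the function names and the particular values of $N$, $x$, $x'$, $y$ and $X$ are arbitrary assumptions chosen for the check, and the exact expression is taken to be $N b^{y} p^{y} (1 - p)^{xb + x'}$ with $p = X/b$.

```python
from math import exp

def exact_probability(N, x, x_prime, y, X, b):
    """N * b^y * p^y * (1 - p)^(x*b + x') with fraction defective p = X / b."""
    p = X / b
    return N * b**y * p**y * (1 - p)**(x * b + x_prime)

def limiting_probability(N, x, y, X):
    """Limiting form N * X^y * exp(-x * X) as b tends to infinity."""
    return N * X**y * exp(-x * X)

# Arbitrary illustrative values (assumptions, not taken from the paper).
N, x, x_prime, y, X = 1.0, 2, 3, 2, 1.5
for b in (10, 100, 10**6):
    print(b, exact_probability(N, x, x_prime, y, X, b), limiting_probability(N, x, y, X))
```

As $b$ increases the exact value approaches the limiting form, the term $x'$ ceasing to matter.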

If the scheme is closed after one of the blocks A, i.e., at a sample size of $k(b + 1)$ where $(k - R_1)$ 
is integral, by a method similar to that used in the exact case the probabilities of reaching the 
$(k - R_1 + 1)$ acceptance points on the lower boundary, $((R_1 + p)b + R_1,\ p)$, $p = 0, 1, \ldots, (k - R_1)$, 
and the $(R_1 + R_2 - 1)$ points on the diagonal cut-off, $(kb - R_2 + r,\ k + R_2 - r)$, 
$r = 1, 2, \ldots, (R_1 + R_2 - 1)$, may be found in the limiting form, and the operating 
characteristic obtained. As in the limiting form for the open scheme, the fraction defective is given 
as a multiple of $1/b$. For the closed scheme, the result will depend on which points of the cut-off 
we assign to the acceptance boundary. 

3. 2. Average sample size 

In calculating the average sample size it is necessary to take into account not only the 
probabilities of acceptance at a finite number of points, but also the probabilities of rejection at an 
infinite number of points. However, all the points on one segment of the rejection boundary, 
on the top of, say, the $(n + 1)$th block A, contribute to the A.S.S. a term, $S_n$, of order $b$, and a 
formula is given below for $S_n$. 

If $t_0$ points on the diagonal cut-off are assigned to rejection, then the last $t_0$ segments of the 
rejection boundary will have to be modified. Fig. 4 will make this clear. This shows the 
acceptance and rejection boundaries for $b = 10$, $H_1 = 20$, $H_2 = 24$, where the scheme is closed at a sample 
size of $53(b + 1)/11$ and the diagonal cut-off adds 1 extra point to the acceptance and 2 to the 
rejection boundaries. Now clearly the rejection boundary can be drawn along the line $y = 5$ 
(as shown by a dotted line), for if a path reaches this line it must eventually cross the real rejection 
boundary. 

Fig. 4. 


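In general, 

$$S_n = b\, e^{-(R_1+n-1)X} \sum_{r=t}^{R_1+R_2} a'_{r,n}\, X^{R_1+R_2+n-1-r} \left\{ 1 - e^{-X} \phi_{r-t}(X) \right\}, \qquad (3)$$

where the modified boundary is a distance $t$ below the original one, i.e., $t = R_1 + t_0 + n - k$, 

$$\phi_r(X) = 1 + X + \ldots + X^r/r!,$$

$$a'_{r,n} = (R_1 + n - 1)\, a_{r,n} + (r - t)\, a_{r-1,n},$$

and $a_{r,n}$ is defined as in para. 3.1. (Proof of (3) in para. 4.2.) 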

The formula (3) may also be used for the contributions to the average sample size from the 
segments of the rejection boundary to the left of the first block A. If there are $m$ irregular blocks 
$M_1, M_2, \ldots, M_m$ before the first A (if $R_2$ is an integer, $m = R_1 - 1$), then $M_1 M_2 \ldots M_j$ 
will be a $1 \times (R_1 + R_2 - m - 1 + j)$ matrix; and the value of $S$ for the segment on the top of 
$M_{j+1}$ $(j = 0, 1, \ldots, m - 1)$ is given by (3) with $n = -(m - j)$ and $t = 0$, $r = (R_1 + R_2 - m), 
\ldots, (R_1 + R_2 - 1)$. 

For the first part of the rejection boundary, $n = -m$, and all the $a$'s are zero except $a_{R_1+R_2-m-1,\,-m}$, 
which equals 1, since all paths start from the origin. 

Thus the average sample size in the limit as $b \to \infty$ 

$$\sim \sum_{n=-m}^{k-R_1} S_n + \sum_{i=R_1}^{k} i\,b\,p_i + k b \sum_{i=1}^{R_1+R_2-t_0-1} q_i,$$

where the $p_i$ are the probabilities of acceptance at the $(k - R_1 + 1)$ acceptance points on the lower 
boundary, and the $q_i$ the probabilities of reaching the $(R_1 + R_2 - t_0 - 1)$ acceptance points on 
the diagonal cut-off. As in the limiting form of the open scheme, the average sample size is given 
as a multiple of $b$, for a fraction defective as a multiple of $1/b$. 

Note on computation of $S_n$. 

For all modified schemes with the same upper boundary (i.e., $n - t$ constant), (the index of 
$X$) + (the suffix of $\phi$) is constant. Hence when such functions as $X^{s}\{1 - e^{-X}\phi_{r}(X)\}$ have been 
computed once they may be used many times. 


4. Mathematical appendix. 

4.1. Properties of the lattice diagram. 

Fig. 5 shows the acceptance and rejection boundaries for a scheme defined by $H_1$, $H_2$ and $b$. 

Fig. 5. 

$x_1$, $x_2$, $y_1$, $y_2$ are defined as shown, and $Q_2$ is the first right-hand end point of a rejection segment 
to fall to the right of $P_1$. 

By the definitions of $H_1$, $H_2$, $b$ we have 

$$x_1 = H_1, \qquad b y_1 - x_2 = H_2 \ \ (0 \le x_2 < b), \qquad b(y_2 - 1) - H_2 < H_1 \le b y_2 - H_2,$$

which give 

$$x_1 = H_1, \qquad x_2 = b \left\langle \frac{H_2}{b} \right\rangle - H_2, \qquad y_1 = \left\langle \frac{H_2}{b} \right\rangle, \qquad y_2 = \left\langle \frac{H_1 + H_2}{b} \right\rangle, \qquad (4)$$

where $\langle x \rangle$ is defined as the smallest integer not less than $x$, 

i.e., $\langle x \rangle = x$ if $x$ is an integer, 
$\langle x \rangle = [x] + 1$ otherwise. 

Putting $H_1 = R_1(b + 1)$, $H_2 = R_2(b + 1)$, we have, from (4), 

$$x_1 = R_1(b + 1), \qquad x_2 = b \left\langle \frac{R_2(b + 1)}{b} \right\rangle - R_2(b + 1), \qquad y_1 = \left\langle \frac{R_2(b + 1)}{b} \right\rangle, \qquad y_2 = \left\langle \frac{(R_1 + R_2)(b + 1)}{b} \right\rangle. \qquad (5)$$

The block B vanishes when $R_1 + R_2$ is an integer. 

For the sample size at $P_2$ $= x_1 + b + 1$, 

and the sample size at $Q_2$ $= y_2 + x_2 + b$ times (the number of complete lengths $b$ between $Q_2$ and 
the $y$-axis). 

Since the $y$-co-ordinate of the boundary increases by 1 for every length $b$ parallel to the $x$-axis, 
this number is $y_2 - y_1$. 

$\therefore$ Sample size at $Q_2$ − Sample size at $P_2$ 

$= y_2 + x_2 + b(y_2 - y_1) - x_1 - b - 1$ 

$= (b + 1)\, y_2 - (R_1 + R_2 + 1)(b + 1)$, from (5) 

$= (b + 1) \left\{ \left\langle \dfrac{(R_1 + R_2)(b + 1)}{b} \right\rangle - (R_1 + R_2 + 1) \right\}$. 

The necessary and sufficient condition for the block B to vanish is that the sample size at some 
$Q$ = the sample size at $P_n$, 

i.e., that $R_1 + R_2$ is an integer. 

For sample size at $Q_2$ = the sample size at $P_2$, we need also $R_1 + R_2 < b$. 

Points on the diagonal $P_nQ_n$. 

We shall consider the case where $R_1 + R_2$ is an integer and $R_1 + R_2 < b$, so that $P_n$, $Q_n$ are 
on the same diagonal, and from (5), $y_2 = R_1 + R_2 + 1$. 

The co-ordinates of $P_n$ are $R_1(b + 1) + (n - 1)b$, $n - 1$. 

The sample size is thus $(R_1 + n - 1)(b + 1)$ and the co-ordinates of $Q_n$ are 
$(R_1 + n - 1)(b + 1) - (y_2 + n - 2)$, $y_2 + n - 2$, 
i.e., $(R_1 + n - 1)b - R_2$, $R_1 + R_2 + n - 1$. 

The points on the R.H. diagonal side of the $n$-th block A are therefore 
$(R_1 + n - 1)b - R_2 + r$, $R_1 + R_2 + n - 1 - r$, $r = 1, 2, \ldots, (R_1 + R_2 - 1)$. 

Number of paths to any point $(\xi, \eta)$ between the boundaries is $O(b^{\eta})$ in $b$. 

(We shall need the result only for integral $R_1 + R_2$, and $R_2 < b$, for which case the proof 
holds.) 

For suppose that the number of paths to the $i$-th point $(\xi_i, \eta_i)$ on any diagonal line through a 
corner point of either boundary is $N_i b^{\eta_i} + O(b^{\eta_i - 1})$. Then the number of paths to the $j$-th point 
$(\xi'_j, \eta'_j)$ of the next diagonal 

$$= \sum_{\eta_i \le \eta'_j} \left[ N_i b^{\eta_i} + O(b^{\eta_i - 1}) \right] {}^{b+1}C_{\eta'_j - \eta_i}$$

$$= \sum_{\eta_i \le \eta'_j} \frac{N_i}{(\eta'_j - \eta_i)!}\, b^{\eta'_j} + O(b^{\eta'_j - 1})$$

$$= O(b^{\eta'_j}).$$

Now, the number of paths to the point $(b + k - y_1 + 1,\ y_1 - k)$ $(1 \le k \le y_1)$, on the first 
diagonal, is simply ${}^{b+1}C_{y_1 - k}$, which is $O(b^{y_1 - k})$, and the result follows by induction for all points on 
the diagonals. 

It follows immediately that the number of paths to any point $(\xi, \eta)$ between the acceptance 
and rejection boundaries is $O(b^{\eta})$, for the number of paths to the point $(\xi, \eta)$ is a non-decreasing 
function of $\xi$ for fixed $\eta$, which is $O(b^{\eta})$ for the greatest and least admissible values of $\xi$. 


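4.2. Proof of (3). 

The numbers of paths to the points on the line $P_nQ_n$ are given by the elements of the matrix 
$MY^n$. We shall denote these by $p_{r,n}$, $r = 1, 2, \ldots, (R_1 + R_2 - 1)$. 

From para. 3.1 we have $p_{r,n} \sim a_{r,n}\, b^{R_1+R_2+n-1-r}$. 

We shall now consider the contribution to the average sample size of the rejection boundary 
between $Q_n$ and $Q_{n+1}$ (excluding $Q_n$ itself), and in particular the limiting form, $S_n$, of this as $b \to \infty$. 
We shall require the following lemma: 

$$\mathop{\mathrm{Lt}}_{b \to \infty} \frac{1}{b^{r+1}} \sum_{j=0}^{b} {}^{j}C_r \left( 1 - \frac{X}{b} \right)^{j} = \frac{1}{X^{r+1}} \left\{ 1 - e^{-X} \phi_r(X) \right\}. \qquad (6)$$

We denote $(1 + X + \ldots + X^r/r!)$ by $\phi_r(X)$, and notice first, by using Leibnitz’s 
theorem on the repeated differentiation of the product of two functions, that 

$$\frac{1}{X^{r+1}} \left\{ 1 - e^{-X} \phi_r(X) \right\} = \frac{(-1)^r}{r!} \left( \frac{d}{dX} \right)^{r} \left\{ \frac{1 - e^{-X}}{X} \right\}.$$

Now $\displaystyle \sum_{j=0}^{b} \left( 1 - \frac{X}{b} \right)^{j} = \frac{b}{X} \left\{ 1 - \left( 1 - \frac{X}{b} \right)^{b+1} \right\}$ (Geometrical Progression). 

We differentiate $r$ times w.r.t. $X$ and multiply by $(-b)^r / r!$. 

Then 

$$\sum_{j=0}^{b} {}^{j}C_r \left( 1 - \frac{X}{b} \right)^{j-r} = \frac{(-1)^r\, b^{r+1}}{r!} \left( \frac{d}{dX} \right)^{r} \left\{ \frac{1 - (1 - X/b)^{b+1}}{X} \right\},$$

and $(1 - X/b)^{r} \to 1$. We therefore have to show that 

$$\mathop{\mathrm{Lt}}_{b \to \infty} \left( \frac{d}{dX} \right)^{r} \left\{ \frac{1 - (1 - X/b)^{b+1}}{X} \right\} = \left( \frac{d}{dX} \right)^{r} \left\{ \frac{1 - e^{-X}}{X} \right\}.$$

But 

$$\left( 1 - \frac{X}{b} \right)^{b+1} = e^{-X - P(X)/b},$$

where $P(X) = -b \left\{ X + (b + 1) \log (1 - X/b) \right\}$ 
can be expanded as a power series in $X$ with coefficients of $O(b^{0})$ and radius of convergence $b$. 
Hence for $|X| \le b' < b$ it is bounded, and (6) follows. 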

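We may now calculate $S_n$. For greater generality we consider the modified form of the scheme 
in which we reject, not on the line $y = R_1 + R_2 + n$, but on $y' = R_1 + R_2 + n - t$. 

Our rejection points are therefore 

$((R_1 + n - 1)b - R_2 + j,\ R_1 + R_2 + n - t)$ $(1 \le j \le b)$. 

The number of admissible paths to this point from the diagonal point 

$((R_1 + n - 1)b - R_2 + r,\ R_1 + R_2 + n - 1 - r)$ is ${}^{j-t}C_{r-t}$ 

(since we must pass through $((R_1 + n - 1)b - R_2 + j,\ R_1 + R_2 + n - t - 1)$), and the 
contribution to the average sample size is 

$$\sum_{r} p_{r,n}\, {}^{j-t}C_{r-t} \left( 1 - \frac{X}{b} \right)^{(R_1+n-1)b - R_2 + j} \left( \frac{X}{b} \right)^{R_1+R_2+n-t} \left\{ (R_1 + n - 1)b + R_1 + n + j - t \right\}.$$

Summing over $j$, we have 

$$\sum_{r} p_{r,n} \left( \frac{X}{b} \right)^{R_1+R_2+n-t} \left( 1 - \frac{X}{b} \right)^{(R_1+n-1)b - R_2} \sum_{j=1}^{b} {}^{j-t}C_{r-t} \left( 1 - \frac{X}{b} \right)^{j} \left\{ (R_1 + n - 1)b + R_1 + n + j - t \right\}.$$

The summation over $j$ is 

$$\sum_{j=1}^{b} \left( 1 - \frac{X}{b} \right)^{j} \left\{ (R_1 + n - 1)(b + 1)\, {}^{j-t}C_{r-t} + (r - t + 1)\, {}^{j-t+1}C_{r-t+1} \right\},$$

which as $b \to \infty$ is, by (6), asymptotically equal to 

$$b^{r-t+2} \left[ (R_1 + n - 1) \frac{1 - e^{-X}\phi_{r-t}(X)}{X^{r-t+1}} + (r - t + 1) \frac{1 - e^{-X}\phi_{r-t+1}(X)}{X^{r-t+2}} \right].$$

Hence the contribution to the average sample size is asymptotically equal to 

$$e^{-(R_1+n-1)X} \sum_{r} p_{r,n}\, b^{\,r+2-R_1-R_2-n}\, X^{R_1+R_2+n-t} \left[ (R_1 + n - 1) \frac{1 - e^{-X}\phi_{r-t}(X)}{X^{r-t+1}} + (r - t + 1) \frac{1 - e^{-X}\phi_{r-t+1}(X)}{X^{r-t+2}} \right],$$

which, since $p_{r,n} \sim a_{r,n}\, b^{R_1+R_2+n-1-r}$, is asymptotically equal to $S_n$, where 

$$S_n = b\, e^{-(R_1+n-1)X} \sum_{r} a_{r,n}\, X^{R_1+R_2+n-t} \left[ (R_1 + n - 1) \frac{1 - e^{-X}\phi_{r-t}(X)}{X^{r-t+1}} + (r - t + 1) \frac{1 - e^{-X}\phi_{r-t+1}(X)}{X^{r-t+2}} \right]$$

$$= b\, e^{-(R_1+n-1)X} \sum_{r} X^{R_1+R_2+n-1-r} \left\{ (R_1 + n - 1)\, a_{r,n} + (r - t)\, a_{r-1,n} \right\} \left\{ 1 - e^{-X}\phi_{r-t}(X) \right\}.$$

Defining $a'_{r,n}$ as $(R_1 + n - 1)\, a_{r,n} + (r - t)\, a_{r-1,n}$, $(t \le r \le R_1 + R_2)$, we have 

$$S_n = b\, e^{-(R_1+n-1)X} \sum_{r=t}^{R_1+R_2} a'_{r,n}\, X^{R_1+R_2+n-1-r} \left\{ 1 - e^{-X}\phi_{r-t}(X) \right\}.$$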

Acknowledgments. 

Since this paper was completed, the authors’ attention has been drawn to a paper by Bartky,³ 
in which a very similar matrix method is introduced. Bartky obtains the operating characteristic 
and average sample size curve for an open scheme defined by $H_1$, $H_2$ and $b$ (in our notation), 
with the restriction that $(H_1 + H_2)/(b + 1)$ is an integer, so that sampling is carried out in blocks 
of size $(b + 1)$. This makes no difference to the operating characteristic, but does affect the 
average sample size owing to the difference in the rejection boundary. The solution involves an 
inverse matrix, for the evaluation of which an approximation is given. Although Bartky considers 
only open schemes, a solution for closed schemes is implicit in his method. He uses, however, 
a purely algebraic approach, and apart from our use of the limiting forms, which enables us to 
deal with the average sample size when sampling is in single units, we feel that the geometrical 
approach is in many ways easier to grasp. 

The matrix method was first suggested to the authors by Mr. G. A. Barnard, who encouraged 
much of the early development of the subject. Our thanks are also given to Mr. F. J. Anscombe 
for encouragement and advice throughout the writing of this paper, and in particular for the 
idea of considering the limiting forms when $b$ is large; and to Mr. H. J. Godwin for very useful 
help in simplifying some nomenclature and making rigorous corrections in some of the proofs. 

The work was carried out as part of the programme of the Ministry of Supply, and permission 
to publish this paper has been obtained. 

References. 

¹ Barnard, G. A., “Sequential Tests in Industrial Statistics,” J.R.S.S. Suppl., 1946, pp. 1-21. 

² Anscombe, F. J. (To appear in J.R.S.S. Suppl., Part II, 1946.) 

³ Bartky, W. (1943), “Multiple sampling with constant probability,” Ann. Math. Statistics, 14, 363-377. 




A Modified Probit Technique for Small Probabilities. 

By M. S. Bartlett. 

1. The probit method of statistical analysis has been most comprehensively described by Bliss¹ 
in connection with the analysis of toxicity data; but it is of course a general method, with 
applications in other fields. In one problem, where the relation with temperature of an occasional but 
serious failure in a certain type of armament was under investigation, it was found convenient to 
introduce probit technique, but in a modified form. This method, described below, is of general 
application to cases where the range of values of the variable under consideration (temperature, 
dosage, etc.) is of most interest in the region of small probabilities. By an obvious symmetry 
in the theory between small probabilities and high ones, the method is equally available in 
appropriate instances when the dosage * required for a high percentage kill, 99 or 99.9 per cent., is being 
estimated. 

The principle is to replace the direct method of sampling by the inverse one, in which the 
number of occurrences determines the size of sample. Such inversion, which is of increasing 
application in sampling problems,† is of course only possible in any instance if the observational 
units or “individuals” can conveniently be sampled one by one. 

The practical procedure recommended was to choose a dosage round about the 50 per cent. 
point, sample until a given number of “survivors” have occurred (two was adopted as a 
convenient number), increase the dosage by a suitable interval, repeat the sampling, and so on until 
sufficient dosages have been tested or the individuals set aside for the experiment have all been 
treated. 

The advantage of this method is that individuals are not wasted at dosages which are irrelevant 
to the purpose of the experiment, the sampling rapidly entering the region of small probabilities. 
For example, in an artificial sampling experiment starting at the 50 per cent. point ($p = \tfrac{1}{2}$), and 
with intervals equal to $\tfrac{1}{2}\sigma = \tfrac{1}{2}/\beta$, where $\sigma$ is the standard deviation of the underlying normal 
distribution and $\beta$ the true slope of the probit-dosage regression line, the distribution of 500 
individuals shown in Table I was obtained (using random sampling numbers). 

Table I. 

Results of sampling experiment. 

Deviation    Probability    Number of      Number of 
                            individuals    survivors 

0            0.500            3            2 
½σ           0.309           10            2 
1σ           0.159            7            2 
1½σ          0.067           18            2 
2σ           0.023           90            2 
2½σ          0.0062         372            1+ 

Total                       500 



In practice we do not of course know either the 50 per cent. point or σ, but this does not affect 
the principle of the method ; the only practical point is that if the intervals happen to prove rather 
too large, it might be advisable to interpolate one or two points before using up all the individuals 
at our disposal. 
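
The inverse sampling procedure of this section can be sketched in Python. This is an illustration under stated assumptions rather than the author's computation: the function name `inverse_sampling`, the seed, and the use of a pseudo-random generator in place of random sampling numbers are all hypothetical, and dosage is expressed as a deviation in units of σ above the 50 per cent. point.

```python
import random
from statistics import NormalDist

def inverse_sampling(n_total=500, n_survivors=2, step=0.5, seed=1):
    """Sample at each dosage until n_survivors survivors occur, then raise the dosage.

    Returns a list of (deviation, survival probability, individuals used, survivors).
    The last dosage may be left incomplete when the individuals run out.
    """
    rng = random.Random(seed)          # hypothetical seed, for repeatability only
    normal = NormalDist()
    remaining = n_total
    deviation = 0.0
    results = []
    while remaining > 0:
        p = 1.0 - normal.cdf(deviation)    # survival probability at this dosage
        used = survivors = 0
        while remaining > 0 and survivors < n_survivors:
            used += 1
            remaining -= 1
            if rng.random() < p:
                survivors += 1
        results.append((deviation, p, used, survivors))
        deviation += step
    return results

for row in inverse_sampling():
    print(row)
```

Every dosage except possibly the last ends with exactly two survivors, so that few individuals are spent near the 50 per cent. point and most fall in the region of small probabilities.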

* It is convenient to use this terminology, provided it is understood that the method is not confined 
to toxicological data. 

† See, for example, the correspondence in Nature by Barnard, G. A., Case, R. A. M., Haldane, J. B. S., 
and Tweedie, M. C. K. (vol. 155, pp. 49 and 453, vol. 156, pp. 115 and 208), the last-named proposing 
the adjective “inverse.” 







2. The appropriate analysis for data obtained by this modified technique is next derived. If 
an event with probability of occurrence $p$ first occurs at the $n$-th trial, its likelihood is 

$$p(1 - p)^{n-1}$$

and its logarithm is 

$$L = \log p + (n - 1) \log (1 - p),$$

whence 

$$\frac{\partial L}{\partial p} = \frac{1}{p} - \frac{n - 1}{1 - p} = \frac{1 - np}{pq}, \qquad E\left( -\frac{\partial^2 L}{\partial p^2} \right) = I(p), \text{ say}, = \frac{1}{p^2 q} \quad (q = 1 - p),$$

since $E(n) = 1/p$. 

For the case of the second occurrence at the $n$-th trial, we have 

$$\frac{\partial L}{\partial p} = \frac{2 - np}{pq}, \qquad I(p) = \frac{2}{p^2 q}.$$

Table II. 

Modified probit analysis. 

(Weighting coefficients, etc., in the case of two occurrences) 

Expected  Percentage    Factor   Weighting      Expected  Percentage    Factor   Weighting 
probit    probability,  p/z      coefficient,   probit    probability,  p/z      coefficient, 
          100p                   2z²/p²q                  100p                   2z²/p²q 

1.1        0.00481      0.2421   34.1           4.6       34.458        0.9357   3.5 
1.2        0.00723      0.2478   32.6           4.7       38.209        1.0018   3.2 
1.3        0.0108       0.2538   31.1           4.8       42.074        1.0759   3.0 
1.4        0.0159       0.2600   29.6           4.9       46.017        1.1593   2.8 
1.5        0.0233       0.2665   28.1           5.0       50.000        1.2533   2.5 
1.6        0.0337       0.2734   26.8           5.1       53.983        1.360    2.3 
1.7        0.0483       0.2806   25.4           5.2       57.926        1.481    2.2 
1.8        0.0687       0.2882   24.1           5.3       61.791        1.620    2.0 
1.9        0.0968       0.2962   22.8           5.4       65.542        1.780    1.8 
2.0        0.135        0.3046   21.6           5.5       69.146        1.964    1.6 
2.1        0.187        0.3134   20.4           5.6       72.575        2.178    1.5 
2.2        0.256        0.3228   19.2           5.7       75.804        2.428    1.4 
2.3        0.347        0.3327   18.1           5.8       78.814        2.721    1.3 
2.4        0.466        0.3432   17.1           5.9       81.594        3.067    1.1 
2.5        0.621        0.3543   16.0           6.0       84.134        3.477    1.0 
2.6        0.820        0.3660   15.0           6.1       86.433        3.968    0.9 
2.7        1.072        0.3786   14.1           6.2       88.493        4.557    0.8 
2.8        1.390        0.3919   13.2           6.3       90.320        5.271    0.7 
2.9        1.786        0.4062   12.3           6.4       91.924        6.139    0.7 
3.0        2.275        0.4214   11.5           6.5       93.319        7.205    0.6 
3.1        2.872        0.4376   10.7           6.6       94.520        8.521    0.5 
3.2        3.593        0.4551   10.0           6.7       95.543       10.159    0.4 
3.3        4.457        0.4739    9.3           6.8       96.407       12.211    0.4 
3.4        5.480        0.4940    8.7           6.9       97.128       14.899    0.3 
3.5        6.681        0.5158    8.1           7.0       97.725       18.101    0.3 
3.6        8.076        0.5394    7.5           7.1       98.214       22.330    0.2 
3.7        9.680        0.5649    6.9           7.2       98.610       27.797    0.2 
3.8       11.507        0.5926    6.4           7.3       98.928       32.945    0.2 
3.9       13.567        0.6227    6.0           7.4       99.180       44.288    0.1 
4.0       15.866        0.6557    5.5           7.5       99.379       56.696    0.1 
4.1       18.406        0.6917    5.1           7.6       99.534       73.277    0.1 
4.2       21.186        0.7313    4.7           7.7       99.653       95.627    0.1 
4.3       24.196        0.7749    4.4           7.8       99.744                 0.0 
4.4       27.425        0.8230    4.1           7.9       99.813                 0.0 
4.5       30.854        0.8764    3.8           8.0       99.865       -         0.0 





This case is most relevant here, and is treated in detail. The corresponding information function 
$h(Y)$ on the probit value $Y$ is given by 

$$\frac{\partial L}{\partial Y} = \frac{(2 - np)z}{pq}, \qquad h(Y) = \frac{2z^2}{p^2 q}, \qquad (1)$$

where $z$ is the Gaussian ordinate corresponding to the probit value $Y$. The method of fitting is 
to use the estimates $p = 2/n$, the corresponding weighting coefficients obtained from the provisional 
probit line being given by $h(Y)$, tabulated in Table II. As $p \to 0$, $h(Y) \to 2(5 - Y)^2$. Strictly 
the sampling should be completed at each dosage, but if, owing to the restriction on the total number 
of individuals available, it remains incomplete, little error is introduced if for the last point the 
standard probit theory and weighting coefficient (ref. 1 or 3) are used (this procedure would be exact 
if prior to the sampling at this dosage it had been decided to use up all the remaining individuals 
at the same dosage). 
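
The weighting coefficient $2z^2/p^2q$ and the factor $p/z$ can be recomputed from the normal distribution. The sketch below is illustrative and not part of the paper; it assumes the usual convention that the probit $Y$ is the normal deviate plus 5, and the function names are hypothetical.

```python
from statistics import NormalDist

_normal = NormalDist()

def factor(Y):
    """Factor p/z for expected probit Y (normal deviate Y - 5)."""
    deviate = Y - 5.0
    return _normal.cdf(deviate) / _normal.pdf(deviate)

def weighting_coefficient(Y):
    """Weighting coefficient h(Y) = 2 z^2 / (p^2 q) for two occurrences."""
    deviate = Y - 5.0
    p = _normal.cdf(deviate)
    z = _normal.pdf(deviate)
    return 2.0 * z * z / (p * p * (1.0 - p))

for Y in (1.1, 4.5, 5.0):
    print(Y, round(factor(Y), 4), round(weighting_coefficient(Y), 1))
```

For small $p$ the ratio $z/p$ approaches $5 - Y$, which is the source of the limiting value $2(5 - Y)^2$ quoted above.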

In a more precise analysis the provisional regression line would also be used to obtain adjusted 
probit values for all the observations, those corresponding to two occurrences being substituted by 
$y$, where (cf. ref. 1, p. 164) 

$$y = Y + \frac{(2 - np)z}{pq} \Big/ \frac{2z^2}{p^2 q}$$

or 

$$y = Y + p(1 - \tfrac{1}{2}np)/z. \qquad (2)$$

The final estimation of the probit line proceeds as usual, whence the fiducial limits for an $x$ 
corresponding to a given $Y$ are given by (ref. 2, p. 325) 

$$x = \bar{x} + \frac{b(Y - \bar{y}) \pm t \sqrt{V(b)(Y - \bar{y})^2 + V(\bar{y}) \{ b^2 - t^2 V(b) \}}}{b^2 - t^2 V(b)} \qquad (3)$$

in the usual notation, $V(b)$ and $V(\bar{y})$ being the estimated variances of $b$ and $\bar{y}$. 

In the original problem, the value of $x$ corresponding to a “survival probability” of 1 in 500 was 
required, i.e., to the value $Y = 2.1218$. 
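
The probit corresponding to a survival probability of 1 in 500 can be checked directly. This is a small illustrative sketch, again assuming the convention that the probit is the normal deviate plus 5; the function name is hypothetical.

```python
from statistics import NormalDist

def probit(p):
    """Probit of a probability p: the normal deviate increased by 5."""
    return NormalDist().inv_cdf(p) + 5.0

print(round(probit(1 / 500), 4))
```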

3. It may be helpful to illustrate the analysis by means of the sampling data of Table I. The 
details are shown below, the analysis being similar to the standard analysis except that for all 
points up to the last the weighting coefficients (corresponding to the probits obtained from a 
provisional line drawn by eye) are taken from Table II. For convenience of computation the unit 
of $x$ is ½σ. 

x      y       w       wx       wy 

0      5.43    1.8     0.0      9.774 
1      4.16    3.0     3.0     12.480 
2      4.43    4.5     9.0     19.935 
3      3.78    7.2    21.6     27.216 
4      2.99   11.1    44.4     33.189 
5      2.22   19.4    97.0     43.068 

-      -      47.0   175.0    145.662 

wx²        wxy        wy² 

748.40     482.094    491.024 
651.60     542.359    451.434 
 96.80     -60.265     39.590 
                       37.519 

χ² = 2.071 (4 d.f.) 
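
The sums in this worked example can be recomputed from the columns $x$, $y$ and $w$. The following Python sketch is an illustration only; the variable names are assumptions, and small differences in the last figure arise because the paper works from rounded intermediate values.

```python
x = [0, 1, 2, 3, 4, 5]
y = [5.43, 4.16, 4.43, 3.78, 2.99, 2.22]
w = [1.8, 3.0, 4.5, 7.2, 11.1, 19.4]

Sw = sum(w)
Swx = sum(wi * xi for wi, xi in zip(w, x))
Swy = sum(wi * yi for wi, yi in zip(w, y))

# Corrected (weighted) sums of squares and products about the means.
Sxx = sum(wi * xi * xi for wi, xi in zip(w, x)) - Swx**2 / Sw
Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y)) - Swx * Swy / Sw
Syy = sum(wi * yi * yi for wi, yi in zip(w, y)) - Swy**2 / Sw

slope = Sxy / Sxx                 # regression coefficient b
regression_ss = Sxy**2 / Sxx      # part of Syy accounted for by the line
chi2 = Syy - regression_ss        # residual, on 6 - 2 = 4 degrees of freedom

print(round(Sw, 1), round(Swx, 1), round(Swy, 3))
print(round(Sxx, 2), round(Sxy, 3), round(Syy, 3))
print(round(slope, 4), round(regression_ss, 3), round(chi2, 3))
```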









In the interpretation of these figures considerable caution is advised. The immediate conclusions 
are: 

(a) on the assumptions made, including that of linearity of the probit line over the whole 
range of $x$, the standard and modified schemes (cases (1) and (2)) are comparable in accuracy. 
They appear equivalent in accuracy at $x = 2\tfrac{1}{2}\sigma$; at $x = 2\sigma$ the standard scheme appears 
somewhat worse, and at $x = 3\sigma$ somewhat better. 

(b) there appears (case (3)) no point in using further individuals (over, say, the 500 used 
in the original experiment) to attempt to explore directly the region $x = 3\sigma$ and over. 

To these conclusions must be added, however, the following comment. It has been pointed 
out that the dominant contribution to the error for large $x$ arises in the standard method from the 
second term, which corresponds to the error in the regression coefficient $b$; this is determined in 
the representative standard scheme over the range $x = -2\sigma$ to $2\sigma$. While in the original problem 
evidence available indicated that the probit line was linear over a sufficient range of $x$ in the relevant 
region, any unwarranted extension of this assumption was to be avoided. While the values listed 
in Table III for $x = 2\tfrac{1}{2}\sigma$ are equal, the contribution to the total variance of an estimated $x$ due to the 
error in the coefficient $b$ is much less in the modified than in the standard method, in spite of the 
actual formal error in determining $b$ being greater. This implies that the modified technique will 
be less sensitive to moderate departures from linearity, as well as having the advantage that $b$ is 
determined from observations nearer the region of interest. 

These advantages are summed up in the commonsense view which first dictated the form of the 
proposed procedure. To estimate an $x$ for given small $p$, we require knowledge of the location 
and slope of the probit line in the relevant range of $x$. The proposed method accumulates 
information on both these parameters as nearly as possible in the relevant range, with a minimum of 
extrapolation from other ranges. 


Acknowledgment is made to the Chief Scientific Officer, Ministry of Supply, for permission to 
refer in this paper to work arising in connection with my war-time employment in that Ministry. 


References 


1 Bliss, C. I., Annals of Applied Biology, 22 (1935), 134-167. 

2 Idem, ibid., pp. 307-333. 

3 Fisher, R. A., and Yates, F., Statistical Tables for Biological, Agricultural and Medical Research (2nd 
ed., 1943). 




The Analysis of a Series of Experiments by the Use of Punched Cards. 

By O. Kempthorne, 

Rothamsted Experimental Station, Harpenden 

Introduction. 

The $2^n$ type of experiment is of great practical utility and the design and analysis of this type of 
experiment have been fully described by Yates.^ The analysis consists of the evaluation of 
$2^n - 1$ treatment effects, each effect being the difference between the means of two sets of $2^{n-1}$ 
plots, and Yates gives a method of obtaining these effects by a process of continued additions 
and subtractions. This method is extremely convenient for a single experiment, but becomes 
laborious when a large number of experiments have to be analysed. A method of analysis by the 
use of punched cards was devised and tested out on a series of experiments on the manurial 
requirements of sugar-beet carried out under the direction of Rothamsted Experimental Station. 
The present paper describes the method applied to that series of experiments. 

The experiments. 

At each centre the experiment was of the standard 2 × 2 × 2 × 2 × 2 factorial type, the 
treatments consisting of all combinations of the following five factors: 

nitrogen: nil or 0.8 cwt. N as sulphate of ammonia per acre (n), 
phosphate: nil or 1.0 cwt. P₂O₅ as superphosphate per acre (p), 
potash: nil or 1.2 cwt. K₂O as muriate of potash per acre (k), 
salt: nil or 5.0 cwt. agricultural salt per acre (s), 
boron: nil or 20 lbs. borax per acre (b). 

Only one replicate of the 32 combinations of the 5 factors was used. The plots were arranged 
in 4 blocks of 8 plots, confounding 3 high-order interactions. The observations carried out on 
each plot which required statistical analysis were : — 

weight of dirty roots, 
tare factor, 
sugar percentage, 
weight of tops, 
plant number, 
purity, 

noxious nitrogen. 

Analyses were made of these seven sets of observations, together with two sets of derived results — 
the weight of clean roots and total weight of sugar. As there were 21 experiments in all, a total of 
189 analyses had to be carried out. 

The usual method of analysing experiments of this type is to evaluate by repeated additions 
and subtractions all the 31 treatment effects, each treatment effect being the difference between 
the averages of two sets of 16 plots. The treatment effect of n, for example, is the difference between 
the mean of the 16 plots with n and the mean of the plots without n and is called the N effect. 
Symbolically the effects may be written in the following way: 

$$N = \tfrac{1}{16}(n - 1)(p + 1)(k + 1)(s + 1)(b + 1),$$

$$NP = \tfrac{1}{16}(n - 1)(p - 1)(k + 1)(s + 1)(b + 1),$$

and so on to 

$$NPKSB = \tfrac{1}{16}(n - 1)(p - 1)(k - 1)(s - 1)(b - 1),$$

where the right-hand side, when expanded, gives the particular combination of the plot yields for 
each effect, 1 being the plot receiving none of the treatments, n the plot receiving sulphate of ammonia 
only and so on. 
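
The symbolic products above amount to a rule of signs: a plot enters an effect with a factor of −1 for each letter of the effect that is absent from the plot. A minimal Python sketch of this rule follows. It is an illustration, not the punched-card procedure of the paper; the function names and the made-up yields are assumptions.

```python
from itertools import combinations, product

FACTORS = "npksb"

def plot_labels():
    """All 32 treatment combinations; '' stands for the plot written 1 in the text."""
    return ["".join(f for f, on in zip(FACTORS, levels) if on)
            for levels in product((0, 1), repeat=len(FACTORS))]

def effect_sign(effect, plot):
    """Sign of a plot in an effect: -1 for each factor of the effect absent from the plot."""
    sign = 1
    for f in effect:
        if f not in plot:
            sign = -sign
    return sign

def treatment_effects(yields):
    """Return the 31 treatment effects, each a signed sum of the 32 yields divided by 16."""
    effects = {}
    for size in range(1, len(FACTORS) + 1):
        for effect in combinations(FACTORS, size):
            total = sum(effect_sign(effect, plot) * y for plot, y in yields.items())
            effects["".join(effect).upper()] = total / 16
    return effects

# Made-up yields for illustration: n adds 2, and n with p together add a further 3.
yields = {plot: 10 + 2 * ("n" in plot) + 3 * ("n" in plot and "p" in plot)
          for plot in plot_labels()}
effects = treatment_effects(yields)
print(effects["N"], effects["P"], effects["NP"], effects["K"])
```

With these artificial yields the N effect is 3.5, the P and NP effects are each 1.5, and every effect involving k, s or b is zero.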

The treatment effects confounded with blocks were NSB, PKB, and NPKS, and treatment 
effects NPB, NKB, NPKB, PSB, NPSB, KSB, NKSB, PKSB and NPKSB were used to estimate 
the experimental error. The partition of the degrees of freedom in the analysis of variance in each 
of the experiments was therefore as follows: 

                D.F. 

Blocks            3 
Treatments       19 
Error             9 

Total            31 


Since, however, each of the 31 degrees of freedom corresponds to a treatment effect, it was not 
necessary to carry out the formal analysis of variance. 

Hollerith cards and machines. 

The usual 80-column card was used. Each column has twelve positions called Y, X, 0, 1, 2, 
3, 4, 5, 6, 7, 8, 9. In numerical work, the “Y” and “X” positions, which lie above the “0” 
position, are not generally used, except for special purposes, such as controlling machine processes 
or conveying qualitative information. The X and Y positions are principally used to enable any 
letter of the alphabet to be punched in one column by a two-hole code. 

The machines used in the analysis were the sorter, reproducer, multiplying punch and senior 
rolling total tabulator. As its name implies, the sorter is used to separate cards according to 
what is punched in any one desired column ; the theoretical speed of sorting is 24,000 cards per 
hour, but on the average a good working speed is about 20,000 cards per hour. The function of the 
reproducer is to punch information from cards in other cards with the same designation in pre- 
assigned columns, and the maximum speed for this type of work is 6,000 cards per hour. The 
reproducer can also be used as a summary punch to punch on cards information obtained by a 
tabulator. The multiplying punch multiplies a numerical field on the card by another field on the 
same or a different card, and punches the product on the card after rounding it off to the desired 
number of figures. The greatest possible number of digits in the multiplier and the multiplicand 
is 8, and as many digits as are required of the product may be punched on the card. 

The function of the tabulator is to add up numerical fields obtaining totals for groups into which 
the cards have been sorted. The latter is effected by what is known as control. The tabulator is 
plugged to read the cards on the columns by which the cards are classified into groups : this in- 
formation is read from the card at one cycle before the field on the card is actually tabulated, and 
it is compared with the information on the card then being tabulated; if the information is the 
same, the process is continued, but if not, the card feed stops, and whatever totals the machine is 
directed to take at the end of a card group are taken and dealt with as required. The possible 
things to be done with the totals are that they be printed and/or taken (technically known as 
“ rolled ”) to other counters, added or subtracted. The tabulator has distributors which enable 
the numerical field on the card to be taken to one of several counters, according to the information 
punched in another part of the card, or which can be used for treating totals differently according 
to their designations. The particular operations performed by the reproducer, multiplying punch 
and tabulator are controlled by switches and removable plugboards : for example, on the tabulator 
there are three separate plugboards and a large number of switches, and when these are set
correctly all that is required is to feed in and take out the punched cards and to keep the machine
supplied with paper. The larger tabulators have 6 counters of 9 or 11 wheels. More detailed
descriptions of some of the Hollerith machines are given by Comrie² and Comrie, Hey and Hudson.³

General description of the method. 

After preparation of the data involving multiplications, conversions from plot units to yields 
per acre and so on, the treatment effects are tabulated. The treatment effects are obtained by 
adding or subtracting each plot value, with division by a factor of 16 at some stage. Each plot yield 
occurs in each treatment effect either with a positive or negative sign, and this information is punched 
on the card. In all 31 (= $2^5 - 1$) columns on the card were used for this purpose. When
tabulating a particular treatment effect, the cards are fed through the tabulator, and the numerical 
field being treated is led, according to whether the distribution column indicates a positive or negative 
sign, into one of two counters: the counter accumulating the positive contributions or that
accumulating the negative contributions. At the end of the group of cards (that is, of the 32
cards of a centre) the amount in the negative counter is subtracted from that in the positive counter
and the result printed, with a symbol if negative. As the treatment effects were to be analysed 
further, they were punched on cards: this could have been done automatically by linking the sum- 
mary punch to the tabulator, but in this analysis it was not convenient to do so. 

The analysis falls into the following well-defined stages : — 

(1) preparation of the raw data for punching, punching and checking of punching, 

(2) operations to be carried out on the raw data by the multiplying punch, 

(3) the analysis of original and derived data and the preparation of summary cards con- 
taining the treatment effects, 

(4) further analyses on the summary cards. 

As the object of this paper is to give an example of the technique of the use of punched cards, the 
actual analysis will be described in fair detail. 

The original data and their allocation to the cards. 

The original data are listed below, together with the columns which they occupied. 


Reference  Item                                     No. of columns  Actual columns
   -       Centre                                          2            56-57
   -       Plot number                                     2            58-59
   a1      Weight of dirty roots in lbs. per plot          4            60-63
   a2      Tare factor                                     2            64-65
   a3      Sugar percentage                                3            66-68
   a4      Weight of tops in lbs. per plot                 3            69-71
   a5      Plant number                                    3            72-74
   a6      Purity                                          3            75-77
   a7      Noxious nitrogen                                3            78-80


A two-column code was used for each centre, one column giving its soil group, and the second 
its reference within a soil group. A two-column code from 01 to 32 was used for the plot number, 
01 being the plot receiving no treatment, 02 the plot receiving n, 03 the plot receiving p, and so on in
the standard order, the plot receiving n, p, k, s and b having the code number 32. It will be noted
that the data were punched at the right end of the card : this is always done when the card is not 
completely filled, to save time in punching. To ensure complete accuracy of punching, in addition 
to verification the punched data were tabulated to give totals for each centre of all the numerical 
fields which were checked against totals obtained by hand. 

Use of the multiplying punch.

A number of multiplications and conversions were made by the multiplying punch. In all
cases the conversions were carried out to give the results in terms of one-sixteenth of an acre, so that
the tabulator would produce the effects in the units finally required, with the exception that the mean
would be given as twice its correct value. The multiplications are set out in the table below;
this table gives the following information:

(1) reference of item obtained, 

(2) name of item, 

(3) its method of construction in terms of the data available ($C_1$, $C_2$, $C_3$ and $C_4$ being
conversion factors from lbs. or number per plot to tons or number in thousands per 1/16 acre
and $C_5$ being 1/16),

(4) whether the multiplication is such that the multipliers are obtained from each detail 
card separately (individual multiplication, denoted by I) or from a master card which gives the 
multiplier for a group of detail cards (group multiplication denoted by G), 

(5) the number of columns required for the punching of the result of the multiplication. 

Reference  Item                                          How obtained  Type of         No. of columns  Position of product,
                                                                       multiplication  required        actual columns
A1         Dirty roots in tons per 1/16th acre           (a1) × C1     G               4               1-4
A2         Clean roots in lbs. per plot                  (a1) × (a2)   I               4               5-8
A3         Clean roots in tons per 1/16th acre           (A2) × C1     G               4               9-12
A4         Total sugar in lbs. per plot                  (a3) × A2     I               4               13-16
A5         Total sugar in cwt. per 1/16th acre           (A4) × C2     G               4               17-20
A6         Tops in tons per 1/16th acre                  (a4) × C3     G               4               21-24
A7         Plant number in thousands per 1/16th acre     (a5) × C4     G               4               25-28
A8         Tare factor by 1/16                           (a2) × C5     G               3               29-31
A9         Sugar percentage by 1/16                      (a3) × C5     G               3               32-34
A10        Purity by 1/16                                (a6) × C5     G               3               35-37
A11        Noxious nitrogen by 1/16                      (a7) × C5     G               3               38-40

A further instruction was necessary to indicate at which digit in each product rounding-off
should take place. On this machine rounding-off is accomplished by what is known as the
“pick up”: when it is desired to round off the product at a certain position, 5 is added to the number
at one position to the right before reading off the product to be punched. In the case of the group
multiplications, a check was obtained by comparing the sum of the products with the product of
the multiplier and the sum of the multiplicands: individual multiplications were checked by
repeating the multiplication on a different machine and checking the cards for double punching.
The speed of the multiplying punch depends on the number of digits in the multiplier: with a
multiplier of 4 digits it is about 1000 cards per hour, and this was approximately realized.
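The rounding rule and the check on group multiplications can be illustrated by a short sketch. The function names, the plot weights and the multiplier below are hypothetical and chosen only for illustration; the check is applied here to the unrounded products, and with rounded products agreement could be expected only to within the accumulated rounding error.

```python
def round_off(product, drop):
    """Drop the last `drop` digits of a non-negative integer product.

    Rounding follows the rule described above: 5 is added one position to the
    right of the last digit kept, and the unwanted digits are then discarded.
    """
    if drop == 0:
        return product
    pick_up = 5 * 10 ** (drop - 1)
    return (product + pick_up) // 10 ** drop


def group_check(multiplier, multiplicands):
    """Sum of the products must equal multiplier times sum of multiplicands."""
    products = [multiplier * m for m in multiplicands]
    return sum(products) == multiplier * sum(multiplicands)


# Hypothetical detail-card fields and a hypothetical group multiplier.
weights = [1234, 2075, 986, 1550]
factor = 357

rounded = [round_off(factor * w, 3) for w in weights]
check_passed = group_check(factor, weights)
```

Working in integers mirrors the machine, which handles digits rather than decimal fractions; the position of the decimal point is a matter of interpretation by the user.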


Tabulation of the treatment effects. 

Prior to the tabulations of the effects it was thought advisable to reproduce the material actually 
to be analysed on to new cards, as a further 31 columns are required on the card for the information 
giving the distribution. It would have been possible to use the X and Y positions of columns occupied 
by numerical fields for the distribution of these fields, but as some of the fields were obtained by a 
long series of multiplications, and cards are occasionally (though comparatively rarely) torn up 
by the machines through faulty handling, the relevant information was put on new cards. The 
reproduction was made as follows : 


Item                          New card columns
Centre                             33, 34
Plot designation                   35, 36
Dirty roots, tons/acre             37-40
Clean roots, tons/acre             41-44
Total sugar, cwt./acre             45-48
Tops, tons/acre                    49-52
Plant number, thous./acre          53-56
Tare factor                        57-59
Sugar percentage                   60-62
Purity                             63-65
Noxious nitrogen                   66-68


The information giving the distribution of the plot values for the treatment effects was then
punched on the cards: all cards were punched “1” in the first column and in columns 2 to 32
were punched “1” or “0,” according to whether the contribution of the plot to the effect was
positive or negative. The distribution columns were as shown in the table below.

The actual information gang-punched on columns 2 to 32 of the detail cards is easily obtained by
examining the expressions for the treatment effects in terms of the original plot yields. Thus,
cards of plots which received both n and p or neither had “1” in column 4 and the others had “0.”
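The punching rule can be sketched as follows. The sketch assumes that both the plot numbers (01 to 32) and the distribution columns (1 to 32) follow the standard order with n varying fastest, so that each number less one, read as a binary mask, marks the factors involved; the function names are hypothetical.

```python
FACTORS = "npksb"  # standard order: n varies fastest, b slowest


def factors_from_code(code):
    """Set of factors marked by a plot number or distribution column (1-32)."""
    mask = code - 1
    return {f for i, f in enumerate(FACTORS) if (mask >> i) & 1}


def distribution_punch(plot_number, column):
    """Return 1 if the plot contributes positively to the effect, else 0.

    The contribution is positive when an even number of the factors in the
    effect are absent from the plot (for NP: both n and p, or neither).
    """
    absent = factors_from_code(column) - factors_from_code(plot_number)
    return 1 if len(absent) % 2 == 0 else 0


# The 31 punches (columns 2 to 32) for the plot receiving n only (plot 02).
punches_plot_02 = [distribution_punch(2, col) for col in range(2, 33)]
```

Column 1, which corresponds to the mean, receives “1” on every card under this rule, and every other column receives sixteen 1's and sixteen 0's over the 32 plots of a centre.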

The treatment effects. 

Effect        Column      Effect        Column
Mean (× 2)       1        B               17
N                2        NB              18
P                3        PB              19
NP               4        NPB             20
K                5        KB              21
NK               6        NKB             22
PK               7        PKB             23
NPK              8        NPKB            24
S                9        SB              25
NS              10        NSB             26
PS              11        PSB             27
NPS             12        NPSB            28
KS              13        KSB             29
NKS             14        NKSB            30
PKS             15        PKSB            31
NPKS            16        NPKSB           32

The 32 treatment effects were obtained by running the cards through the tabulator, using
individual card distribution on the relevant columns. For example, the cards were first sorted to
columns 33, 34 (centre), on which also control was effected: the N effect was obtained by
distributing on column 2, the fields of cards with “0” in column 2 going into counter number one, say,
and those with “1” in column 2 going into counter number two. At the end of the group of cards
(32 in number) for each centre, when the control on columns 33 and 34 operated, counter one
contained the negative part of the treatment effect and counter two the positive part. The number
in counter one was then rolled into counter two to obtain a grand total as a check. On the next
cycle the number in counter one was rolled into itself, and on the following cycle the resulting
number in counter one was subtracted from that in counter two. The result in counter two was
the treatment effect, which was printed as a true balance, that is, the actual number if positive,
and the actual number and not its complement, together with a symbol, if negative. All the
instructions to the machine were permanently set up on its plugboards with two exceptions:

(1) the fields on the card to be added and distributed, 

(2) the column on which the cards are to be distributed. 

When the plugging of (1) above had been done, the cards were run through a total of 32 times,
distributing on column 1 on the first run, on column 2 on the second run, and so on. In all, 9
fields had to be added and distributed, and as two fields could be tabulated in one run, five multiple
runs (1 multiple run equals 32 single runs) were required.
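The sequence of counter operations for one card group can be followed in a small simulation. This is only a sketch of the arithmetic, not of the machine: the function name is hypothetical, and the field values in the usage line are invented.

```python
def tabulate_effect(fields, punches):
    """Simulate the two-counter procedure for one group of cards.

    `fields` are the numerical fields read from the cards and `punches` the
    corresponding distribution punches (1 = positive, 0 = negative).
    Returns (grand_total, treatment_effect).
    """
    counter_one = 0  # accumulates the negative part
    counter_two = 0  # accumulates the positive part
    for value, punch in zip(fields, punches):
        if punch == 1:
            counter_two += value
        else:
            counter_one += value

    # Counter one is rolled into counter two: grand total, printed as a check.
    counter_two += counter_one
    grand_total = counter_two

    # Counter one is rolled into itself (doubled) ...
    counter_one += counter_one
    # ... and then subtracted from counter two, leaving positive minus negative.
    counter_two -= counter_one
    return grand_total, counter_two


grand_total, effect = tabulate_effect([3, 5, 2, 7], [1, 0, 1, 0])
```

Doubling the negative part before subtracting is what allows the grand total to be taken first without disturbing the final balance: (positive + negative) less twice the negative is positive less negative.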

A part of the result of a typical run is given below (the S effect on total sugar and tops at each
of the centres):

11     5816    254*    21668     684*    16161616
21     7472    272     26282    1040     16161616
22     3360    206     15762     140*    16161616
23     5695    645     22483    1111     16161616
24     6944    120     25986     738     16161616


The first column gives the code of the centre, the second gives for total sugar twice the mean (a 
check), and the particular effect for each centre to the nearest hundredth of a cwt., and the third 
column gives the same results for tops to the nearest thousandth of a ton. An asterisk denotes a
negative effect. The fourth column is entirely a check column, indicating that the distribution 
has been carried out correctly. 




As the treatment effects are obtained for each centre, the number of cards in each card group is
32. The time for each group of 32 cards was as follows:

 0.6 sec.  for listing and tabulating the first card
12.4 secs. for tabulating remaining 31 cards
 1.2 secs. for rolling operations
 1.2 secs. for printing
 0.4 sec.  for zeroising the counters
-----
15.8 secs.


One single run giving a particular treatment effect for two sets of observations for all 21 centres 
should have taken, theoretically, about 6 minutes. The cards were then taken from the stacker 
and placed in the feed, some more paper inserted in the print unit, and the distribution changed. 
Were it not for the fact that the alterations in the plugging at the end of each single run were very 
simple, the complete tabulation would have been rather slow. The only plugging alteration, 
however, was the changing of one plug leading from the drum (giving the distribution designation) 
to the following card column on which the distribution was taking place. This plug leads to the 
“ following card column ” “1 ” for the first run, to “ 2 ” for the second run and so on, leading to 
“ following card column ” “ 32 ” for the thirty-second run. At the end of each group of 32 runs 
a change in plugging was necessary only on the first of the three panels — the changing of the plugs 
from the card columns of the fields to be added to the distributors. In spite, therefore, of the 
tabulation of the treatment effects consisting of 160 separate tabulator runs, the whole was done
in about 30 tabulator-hours, including time lost for minor troubles of controls acting when they 
should not (giving split totals) and so on. 
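The timing figures quoted above can be checked by direct arithmetic. The sketch below uses only the figures given in the text (the cycle times, 21 centres, and five multiple runs of 32 single runs); the variable names are hypothetical.

```python
# Cycle times for one group of 32 cards, in seconds, as listed in the text.
GROUP_SECONDS = {
    "first card (list and tabulate)": 0.6,
    "remaining 31 cards": 12.4,
    "rolling operations": 1.2,
    "printing": 1.2,
    "zeroising the counters": 0.4,
}

CENTRES = 21
SINGLE_RUNS = 5 * 32  # five multiple runs of 32 single runs each

seconds_per_centre = sum(GROUP_SECONDS.values())
minutes_per_single_run = CENTRES * seconds_per_centre / 60
theoretical_hours = SINGLE_RUNS * minutes_per_single_run / 60

print(f"{seconds_per_centre:.1f} s per centre")
print(f"{minutes_per_single_run:.1f} min per single run")
print(f"{theoretical_hours:.1f} h of theoretical running time")
```

On these figures the theoretical running time is roughly half of the 30 tabulator-hours actually taken, the remainder being accounted for by handling the cards between runs, replugging, and the minor troubles mentioned.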

The summary cards. 

All the treatment effects at each centre were punched on cards. The constitution of these 
summary cards was as follows (the designation of the treatment effect being the number of the 
column by distribution on which it was obtained):— 


Item                                Actual columns
Centre                                  1, 2
Designation of treatment effect         3, 4
Dirty roots                             5-9 + 10
Clean roots                            11-15 + 16
Total sugar                            17-21 + 22
Tops                                   23-27 + 28
Plant number                           29-33 + 34
Tare factor                            35-39 + 40
Sugar percentage                       41-45 + 46
Purity                                 47-51 + 52
Noxious nitrogen                       53-57 + 58


Five columns were necessary for the numerical part of each effect, together with another column 
giving its sign (in this particular instance a “1” denoted a positive effect and a “5” a negative
effect). 

There were therefore 22 columns remaining on the summary card which could be used to carry 
out supplementary analyses. 

Checking of tabulation. 

For the purpose of checking the tabulation and for other purposes, the designation of the 
treatment effects was supplemented by a five-column code in columns 59, 60, 61, 62 and 63, twice
the mean being 00000, the N effect 10000, the NP effect 11000 and so on. The summary cards were
run through the tabulator in exactly the same way as the original cards, distributing first on column
59 and then on column 60. The result of this tabulation should be exactly 32 times the yield of plot
31 and plot 30, i.e., the plots receiving pksb and nksb respectively. The process could be continued
to reproduce all the original plot values (multiplied by 32), but a couple of runs were thought
sufficient to check the whole of the tabulations. If the result of the check tabulation had not been 
correct, it would have been necessary to do the set of tabulations again, since the check only indicates 
that some of the 31 effects are wrong and not which ones; in fact, with this particular analysis, 
the check tabulations brought to light only one tabulating error, which was easily located and 
corrected. 
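The reason the check recovers a single plot value can be seen from a small sketch. It assumes the standard-order coding (plot number and effect column each one more than a binary mask over n, p, k, s, b), takes the effects as plain signed sums of the card values, and chooses the sign convention (code digit 0 added, digit 1 subtracted) that makes the result positive; the function names and the random plot values are hypothetical.

```python
import random


def sign(effect_mask, plot_mask):
    """+1 or -1: parity of the effect's factors that are absent from the plot."""
    absent = effect_mask & ~plot_mask
    return 1 if bin(absent).count("1") % 2 == 0 else -1


def effects_from_plots(values):
    """All 32 signed sums (index 0 is the total, i.e. the 'mean' column)."""
    return [sum(sign(e, p) * values[p] for p in range(32)) for e in range(32)]


def check_run(effects, code_digit):
    """Distribute the summary cards on one digit of the five-column code."""
    bit = 1 << code_digit
    return sum(-x if (e & bit) else x for e, x in enumerate(effects))


random.seed(1946)
plot_values = [random.randint(100, 999) for _ in range(32)]  # invented card values
effects = effects_from_plots(plot_values)

# Distributing on the N digit isolates the plot receiving pksb (plot 31, mask 30);
# distributing on the P digit isolates the plot receiving nksb (plot 30, mask 29).
first_check = check_run(effects, 0)
second_check = check_run(effects, 1)
```

Each plot value enters the check total with a coefficient that is a product of five factors, each 0 or 2, so that every plot but one cancels and the survivor is multiplied by $2^5 = 32$.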

Supplementary analysis.

In previous years the series of experiments consisted of $2^4$ experiments on the four factors,
n, p, k and s. It was therefore decided to obtain treatment effects in the absence of b: for example,
that for the factor n being N − NB. (Since, however, the factor b had little effect, these treatment
effects were not used.) When the 5-column code for the treatment effects had been put on the
cards, these quantities were obtained from the tabulator by sorting the cards into groups ignoring
the last column, and subtracting the field on the card with “1” in this last column from that on the
card with “0”. It was first necessary to abstract the cards for effects NSB, PKB and NPKS,
which were confounded with blocks and were assumed to be zero. Using two counters, all 15
treatment effects and the mean (multiplied by 2) were obtained for one factor in one run through the
machine. The rate at which these results were obtained on the tabulator was rather slow, since there
were in general only 2 cards in each group, the theoretical time for each group of 2 cards being:

0.6 sec.  for first card listing and tabulating
0.4 sec.  for tabulating the second card
0.4 sec.  for subtracting one counter from the other
0.6 sec.  for printing the result
0.4 sec.  for zeroising the counters
----
2.4 secs.

The theoretical speed was therefore 3000 cards per hour, so that a single run giving the effects at all 
21 centres for two sets of observations took about a quarter of an hour of running time. 

Samples from the 32 plots were grouped for chemical analysis, and it was necessary to obtain 
the yields of the eight combinations of the three treatments, n, s and b, averaging over the other two
treatments. At the same time the triple interaction of the three treatments was to be taken as zero.
The required yields are compounded of the six treatment effects N, S, NS, B, NB, SB and the mean
(multiplied by two). The following information was punched on the cards containing these 
effects : — 


                                        Column
Effect        Designation   64   65   66   67   68   69   70   71
Mean (× 2)        01         1    1    1    1    1    1    1    1
N                 02         0    1    0    1    0    1    0    1
S                 09         0    0    1    1    0    0    1    1
NS                10         1    0    0    1    1    0    0    1
B                 17         0    0    0    0    1    1    1    1
NB                18         1    0    1    0    0    1    0    1
SB                25         1    1    0    0    0    0    1    1



Tabulating these effects, distributing in turn on columns 64 to 71 and subtracting the sum of the
0’s from the sum of the 1’s gave the yields from the treatments nil, n, s, ns, b, nb, sb and nsb averaged
over p and k at each of the 21 centres. This tabulation was again rather slow, as there were only
7 cards in each group, and the speed realized was about 2000 cards per hour.

Presentation of results, computation of errors and supplementary hand analysis.

The treatment effects were listed in order for convenience in abstracting and other work. A
section of the listing showing the effects at one centre for total sugar (in units of one-hundredth of a
cwt.), and for tops (in units of one-thousandth of a ton) is given: a “1” attached to a particular
effect indicates that it is positive, and a “5” that it is negative.

It was not considered economic to obtain the errors by the use of Hollerith machines. These
were obtained very rapidly by listing the 9 treatment effects which were used to estimate the error.
The sum of squares of these 9 effects multiplied by 8/9 gives the error mean square, and the standard
error of each treatment effect is $\sqrt{(\text{error mean square})/8}$.

A number of covariances (e.g., of yield on plant number) had to be examined, and this was
also facilitated by the separate listing of the treatment effects which were used to estimate the error.


Centre  Effect   Total sugar       Tops
  23      01       5695  1       22483  1
  23      02        101  1        3409  1
  23      03         27  1         283  5
  23      04         83  5         145  5
  23      05        465  1        1263  1
  23      06        167  1         485  1
  23      07        113  1         289  1
  23      08          3  1         435  1
  23      09        645  1        1111  1
  23      10        179  1          33  1
  23      11         91  5         419  5
  23      12        103  1         413  5
  23      13        197  5         881  5
  23      14        151  5         455  5
  23      15        159  1         229  1
  23      16          1  1          53  5
  23      17         51  5         633  5
  23      18         41  5         327  5
  23      19         75  5         419  5
  23      20        123  1         557  5
  23      21         11  1         299  1
  23      22        171  5         741  1
  23      23        249  5         295  5
  23      24         93  1         791  1
  23      25         93  5         765  5
  23      26         37  1         111  5
  23      27        105  5         491  5
  23      28        109  1         937  5
  23      29        173  1         371  1
  23      30         63  1         545  1
  23      31        107  5         221  1
  23      32          5  5         559  1


If it were desired to test the regression of the result of observation B on that of observation A, for
example, it was necessary only to form the sum of products, [AB] say, of the error treatment
effects of A and B. The sums of squares of the A effects [AA] and of the B effects [BB] would have
been obtained already in the computation of the errors, and the test of significance is the variance
ratio test of $[AB]^2/[AA]$ against $([BB] - [AB]^2/[AA])/8$ with 1 and 8 degrees of freedom
respectively.
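The hand computations described in this section amount to a few sums of squares and products, as the following sketch shows. The function names and the two sets of nine error effects are hypothetical; the standard error is computed on the assumption that each effect is the difference between two means of 16 plots, so that its variance is one-eighth of the error variance.

```python
import math


def error_mean_square(error_effects):
    """Sum of squares of the 9 error effects multiplied by 8/9."""
    return 8.0 / 9.0 * sum(e * e for e in error_effects)


def effect_standard_error(error_effects):
    """Standard error of one effect, assuming variance = error variance / 8."""
    return math.sqrt(error_mean_square(error_effects) / 8.0)


def regression_variance_ratio(a, b):
    """Variance ratio for the regression of B on A from paired error effects.

    Compares [AB]^2/[AA] (1 d.f.) with ([BB] - [AB]^2/[AA]) / (k - 1),
    where k is the number of error effects (8 d.f. when k = 9).
    """
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    ab = sum(x * y for x, y in zip(a, b))
    regression = ab * ab / aa
    residual = (bb - regression) / (len(a) - 1)
    return regression / residual


# Invented error effects for two observations, A and B.
effects_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
effects_b = [2, 1, 4, 3, 6, 5, 8, 7, 10]
ratio = regression_variance_ratio(effects_a, effects_b)
```

The ratio would be referred to the variance-ratio distribution with 1 and 8 degrees of freedom; no mean is removed from the effects, since each error effect has expectation zero.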

General considerations. 

The actual analysis was carried out by Hollerith in about six days, two days being taken for
punching and multiplying, two days tabulating using two tabulators, and two days for supple- 
mentary work. The computation of errors and classification of the results took a comparatively 
trivial time. This may be compared with the time required in doing the analysis by hand using 
electric calculating machines ; it is estimated that four computers would be occupied for something 
of the order of six weeks doing the same number of analyses. From the point of view of cost there 
is little to choose between the two methods, but the time factor was important, for the results of the 
experiments were not all available until early in January, and the analyses were required by the end 
of February, when the experimental programme for the next season came under consideration. 




The time factor is also important, in that the analysis of most agricultural experiments must
be carried out during the following winter, and imposes a heavy seasonal load on a computing staff.
The method described does not require skilled computers to a large extent, and therefore can be 
used to relieve the seasonal load. 

The analysis of this type of experiment by the use of punched cards is of particular value, in 
view of the present emphasis on series of similar experiments rather than on single experiments of 
varying designs. 

Application of the method to $3^n$ factorial designs.

The simplicity of the analysis of the $2^n$ type of experiment lies in the fact that the treatment
effects are all linear functions of the plot yields with coefficients of plus or minus one-sixteenth.
Another common type of experiment is the $3^n$, testing all combinations of n factors each at three
levels. In the case of one replicate of a $3^3$ experiment arranged in three blocks of nine, with one
pair of the eight triple interaction degrees of freedom confounded with blocks, the partition of
the degrees of freedom in the analysis of variance of the results is frequently as follows (assuming the
three factors to be n, p, k):


                             D.F.
Blocks                         2
N   linear                     1
    curvature                  1
P   linear                     1
    curvature                  1
K   linear                     1
    curvature                  1
NP  linear by linear           1
NK  linear by linear           1
PK  linear by linear           1
Error                         15

Total                         26


The evaluation of the total sum of squares presents little trouble, particularly if the plot yields
are listed conveniently and a fully automatic Monroe calculating machine is available. The
linear effect of n (the levels being 0, 1 and 2 units) is N′, a fixed multiple of $[n_2] - [n_0]$, where
$[n_2]$ is the sum of the plots receiving n at level 2 and so on. The curvature N″ is a multiple of
$[n_0] - 2[n_1] + [n_2]$, and the linear-by-linear n and p interaction N′P′ is a multiple of
$[n_0p_0] - [n_2p_0] - [n_0p_2] + [n_2p_2]$. The blocks may also be regarded as a three-level factor (b)
and the two degrees of freedom split into a linear and a curvature degree of freedom. It is then
possible to obtain by distribution four linear effects B′, N′, P′, K′, four curvatures B″, N″, P″, K″,
and the three linear-by-linear two-factor interactions, but without the appropriate numerical
divisors. In the case of the $2^n$ type of experiment the divisor is the same for all treatment effects,
and no difficulty arises. Given the effects without the divisors, however, the completion of the
analysis of variance by inserting the individual degrees of freedom sum of squares, consisting of a
certain fraction of the square of the number obtained by the tabulator, would be a light task. In
order to make the analysis completely self-checking, it would be advisable to evaluate the square
for each degree of freedom, i.e., in addition to the above, the 6 linear-by-curvature and 3
curvature-by-curvature two-factor interactions, and the four pairs of triple interactions W, X, Y and Z
(one of these being identical with blocks), each of which would be split into two separate degrees
of freedom.
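The completion of the analysis from undivided totals can be sketched as follows. The divisors used are the usual ones for orthogonal contrasts (the number of plots in each total multiplied by the sum of squares of the coefficients); they are an assumption of this sketch and are not taken from the text. The function names and the level totals are hypothetical.

```python
LINEAR = (-1, 0, 1)      # coefficients for levels 0, 1, 2
CURVATURE = (1, -2, 1)


def contrast(totals, coeffs):
    """Undivided contrast, as the tabulator would deliver it by distribution."""
    return sum(c * t for c, t in zip(coeffs, totals))


def contrast_ss(totals, coeffs, plots_per_total):
    """Single-degree-of-freedom sum of squares from an undivided contrast."""
    divisor = plots_per_total * sum(c * c for c in coeffs)
    return contrast(totals, coeffs) ** 2 / divisor


def linear_by_linear_ss(table, plots_per_cell):
    """Linear-by-linear interaction from a 3 x 3 table of cell totals."""
    value = sum(
        LINEAR[i] * LINEAR[j] * table[i][j] for i in range(3) for j in range(3)
    )
    divisor = plots_per_cell * sum(a * a for a in LINEAR) ** 2
    return value ** 2 / divisor


# Invented totals [n0], [n1], [n2], each over 9 plots of a 3^3 experiment.
n_totals = [90, 108, 99]
ss_linear = contrast_ss(n_totals, LINEAR, 9)         # divisor 18
ss_curvature = contrast_ss(n_totals, CURVATURE, 9)   # divisor 54
```

The two single-degree-of-freedom components add up to the ordinary two-degree-of-freedom sum of squares between the three levels, which provides a convenient check on the divisors.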

The above method is analogous to that used in the analysis of $2^n$ experiments, in that each degree
of freedom corresponds to a treatment effect consisting of a simple linear combination of the plot
yields. In certain instances, however, it is desirable to examine all the interactions of pairs of
factors. In the case of a $3^4$ experiment, for example, it is possible to obtain a reliable estimate of
the experimental error using triple and quadruple interactions only. In this case with one replicate
arranged in blocks of 9 plots, the following partition of the degrees of freedom could be used:

                             D.F.
Blocks                         8
Main effects                   8
Two-factor interactions       24
Error                         40
Total                         80

The use of the method described above for the $3^3$ experiment and the construction of three-by-
three tables, showing interactions between pairs of factors, from the separate treatment effects would
be laborious, and it would be simpler to prepare tables showing these interactions directly from the 
tabulator. A table showing the totals of plots with each of the nine combinations of two factors 
with all the marginal totals can be produced by the tabulator in a single run of the machine. All 
the tables involving two factors for each centre can therefore be produced in six runs of the cards 
through the tabulator. The analysis could then be completed in the normal way. This method 
is, of course, only worth applying when several sets of observations on a large series of experi- 
ments have to be analysed. 

Summary. 

A method of analysing $2^n$ experiments by the use of punched cards is described, and illustrated
by the analysis of a $2^5$ series of experiments on the manurial requirements of sugar beet. Some
suggestions are made for the analysis of a series of experiments of the $3^n$ type by the same means.
The Hollerith work on this analysis was carried out by the British Tabulating Machine Company, 
to whom acknowledgements must be made. 

References 

1 F. Yates, The Design and Analysis of Factorial Experiments. Imperial Bureau of Soil Science, 1937.

2 L. J. Comrie, The Hollerith and Powers Tabulating Machines. Private circulation, 1933.

3 L. J. Comrie, G. B. Hey and H. G. Hudson, “The Application of Hollerith Equipment to an Agricultural
Investigation,” J. Roy. Stat. Soc. (Suppl.), 1937, 2, 210-24.





The Statistical Analysis of Variance-Heterogeneity and the Logarithmic 

Transformation. 


By M. S. Bartlett and D. G. Kendall
(Queens’ College, Cambridge ; Magdalen College, Oxford.) 


1. Preliminary remarks.

While a useful approximate null test of significance of the heterogeneity of variances is available
(ref. 1), it is often required in more detailed investigations of variance heterogeneity to apply the
powerful technique of “analysis of variance” to the data, when suitably transformed. For an
estimate $s^2$ of a variance $\sigma^2$ based on $n$ degrees of freedom, the distribution of $ns^2/\sigma^2$ is well
known to be a $\chi^2$-distribution with $n$ degrees of freedom, if the estimate has been obtained from a
normal sample. It immediately follows that the distribution of $\ln s^2 - \ln \sigma^2$ is entirely independent
of $\sigma^2$, and hence that that of $\ln s^2$ only depends on $\sigma^2$ through the term $\ln \sigma^2$ in its mean value.* The


variate $\ln s^2$ (or any equivalent variate such as $\ln s$, $\ln ns^2$, $\log_{10} s^2$, etc.) is thus a convenient
variate to consider. The above argument clearly applies in the more general case when the original
population from which the sample yielding $s^2$ was obtained has any distribution of the form
$f\{(x - m)/\sigma\}\,dx/\sigma$. The more detailed discussion of the distribution of $\ln s^2$ in the next section
refers to the case of normal samples.


2. Properties of the distribution of $\ln s^2$.

Some precautions are necessary when the analysis of variance is applied to a transformed variate 
(cf. ref. 2). In the ideal case:

(a) the transformed variate would be normal,

(b) its variance would be unaffected by changes in its mean,

(c) real effects would be additive on the transformed scale,

(d) the arithmetic mean would provide a fully efficient statistic for the unknown parameter
in the case of homogeneity. (These requirements are of course not independent,
the fourth being a consequence of the first three.)

In the present case condition (b) is exactly satisfied. Condition (c) is approximately satisfied
provided the heterogeneity effects are not large, and are additive on some scale, say the original
one, since they are then necessarily additive to the first order on the transformed scale.
Alternatively, the heterogeneity effects may be precisely defined by arithmetic addition on the
transformed scale, though this has the disadvantage that the definition in many applications would be
an unnatural one, leading to “interaction” effects which would be artificial. With regard to
condition (a), the characteristic function of $\ln s^2$ is

$$M(t) = \left(\frac{2\sigma^2}{n}\right)^{it} \frac{\Gamma(\tfrac{1}{2}n + it)}{\Gamma(\tfrac{1}{2}n)},$$

whence the cumulant function $K(t) = \ln M(t)$ is

$$K(t) = it(\ln \sigma^2 - \ln \tfrac{1}{2}n) + \ln \Gamma(\tfrac{1}{2}n + it) - \ln \Gamma(\tfrac{1}{2}n),$$

and

$$\kappa_1 = (\ln \sigma^2 - \ln \tfrac{1}{2}n) + \psi(\tfrac{1}{2}n), \qquad \kappa_{r+1} = \psi^{(r)}(\tfrac{1}{2}n) \quad (r > 0).$$

* The notation ln is used for the Napierian logarithm.



where $\psi(x)$ is $d \ln \Gamma(x)/dx$, and $\psi^{(r)}(x)$ is the $r$th derivative of $\psi(x)$.* From these results the
values of $\kappa_1$, $\kappa_2$, $\gamma_1 = \kappa_3/\kappa_2^{3/2}$ and $\gamma_2 = \kappa_4/\kappa_2^2$ are given for reference in Table I up to
$n = 20$. For larger values of $n$, it is sufficient to take



The values of $\gamma_1$ and $\gamma_2$ are also plotted in Fig. 1, and the ratio of $2/(n - 1)$ to $\kappa_2$ in Fig. 2.

Figs. 1 and 2. Constants for the distribution of $\ln s^2$ (see Table I).

With regard to condition (d), it is noted that in the case of complete homogeneity the
information in the mean value of a set of statistics $\ln s^2$ tends, as the number in the set becomes
large, to be proportional to the value of $1/\kappa_2$, whereas the information in the sufficient statistic
is proportional to $n/2$. A measure of the efficiency is thus $E = 2/(n\kappa_2)$, and this is also given in
Table I and plotted in Fig. 3.
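The efficiency can be evaluated numerically once $\kappa_2$ is known; for a normal sample $\kappa_2$ is the first derivative of $\psi$ (the trigamma function) at $\tfrac{1}{2}n$. The sketch below computes the trigamma function by the recurrence $\psi'(x) = \psi'(x+1) + 1/x^2$ followed by its standard asymptotic series; the function names are hypothetical and the method is merely one convenient choice, not necessarily that used to construct Table I.

```python
def trigamma(x):
    """First derivative of psi(x) for x > 0.

    The argument is raised above 10 by the recurrence and the remaining part
    is taken from the asymptotic series
    1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7).
    """
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv + inv2 / 2.0 + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return total + series


def efficiency(n):
    """E = 2 / (n * kappa_2), as a percentage, with kappa_2 = trigamma(n/2)."""
    return 100.0 * 2.0 / (n * trigamma(n / 2.0))


for n in (1, 5, 10):
    print(n, round(efficiency(n), 2))
```

For $n = 1$ the value reduces to $400/\pi^2$ per cent., since $\psi'(\tfrac{1}{2}) = \pi^2/2$.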

3. Number of degrees of freedom required for individual variance estimates. 

While no hard-and-fast rule can be laid down, the above results suggest that the transformation 
may safely be used for $n = 10$ and over, more tentatively from $n = 5$ to $n = 9$, and probably not
at all below $n = 5$. In the first case $E$ is over 90 per cent.; in the second between 80 per cent. and
90 per cent.; and in the third below 80 per cent. The values of $\gamma_1$ and $\gamma_2$ indicate, however, that

* These functions are tabulated in ref. 4. 



the approach to normality is rather slow, and this is confirmed by the relatively small difference
in the distributions of ln s² for n = 5 and n = 9 (Fig. 4). Of γ₁ and γ₂, the latter is more important
in analysis of variance problems, owing to the tendency for γ₁ to be eliminated when differences
of the variates are tested. In Fig. 5 the tails of the symmetrical distributions with the same
values of γ₂ are shown for n = 5, n = 9 and n = ∞, and the slow convergence to normality is
again illustrated. The nature of the tail for n = 5 and n = 9 suggests that even for n greater than
9 the chance of large deviations will be appreciably higher than would be inferred on the normal



Fig. 3. The efficiency of the mean of ln s² for estimating σ².


Table I

Constants for the distribution of ln s²

 n    κ₁ − ln σ²    −(1/n + 1/3n²)    1/κ₂      2/{(n − 1)κ₂}    γ₁        γ₂        E = (2/nκ₂) × 100
 1    −1.27036      −1.33333          0.20264                    −1.535    +4.000    40.53
 2    −0.57721      −0.58333          0.60793   1.2159           −1.140    +2.400    60.79
 3    −0.36898      −0.37037          1.0697    1.0697           −0.917    +1.613    71.30
 4    −0.27036      −0.27083          1.5505    1.0337           −0.780    +1.188    77.53
 5    −0.21313      −0.21333          2.0393    1.0197           −0.688    +0.931    81.57
 6    −0.17583      −0.17593          2.5321    1.0128           −0.621    +0.763    84.40
 7    −0.14961      −0.14966          3.0270    1.0090           −0.570    +0.644    86.49
 8    −0.13018      −0.13021          3.5233    1.0067           −0.529    +0.557    88.08
 9    −0.11521      −0.11523          4.0205    1.0051           −0.496    +0.490    89.34
10    −0.10332      −0.10333          4.5183    1.0041           −0.469    +0.437    90.37
11    −0.09365      −0.09366          5.0165    1.0033           −0.445    +0.395    91.21
12    −0.08564      −0.08565          5.5150    1.0027           −0.425    +0.360    91.92
13    −0.07889      −0.07890          6.0138    1.0023           −0.407    +0.330    92.52
14    −0.07313      −0.07313          6.5128    1.0020           −0.391    +0.305    93.04
15    −0.06815      −0.06815          7.0119    1.0017           −0.377    +0.284    93.49
16    −0.06380      −0.06380          7.5111    1.0015           −0.364    +0.265    93.89
17    −0.05998      −0.05998          8.0104    1.0013           −0.353    +0.249    94.24
18    −0.05658      −0.05658          8.5098    1.0012           −0.342    +0.234    94.55
19    −0.05355      −0.05356          9.0092    1.0010           −0.333    +0.221    94.83
20    −0.05083      −0.05083          9.5088    1.0009           −0.324    +0.210    95.09




approximation, and the real significance of observed deviations consequently somewhat less than
the apparent significance.

This conclusion is also illustrated by a comparison of the direct use of the logarithmic variance
for a null test of heterogeneity with the approximate test already referred to (ref. 1). Both tests
make use of the χ² approximation, the one now under consideration simply by comparing the
observed variance of ln s² with its theoretical variance. In Table II the true values of χ² required



Fig. 4. The distribution of ln (s²/σ²), expressed as a multiple of the approximate standard deviation √{2/(n − 1)}.

Table II

χ² approximation for test I (ref. 1) and test II (present test)

                        n = 1    2       3       6       12      ∞
P = 0.10    Test I      2.48     2.66    2.69    2.70    2.70    2.706
            Test II     2.62     2.64    2.65    2.68    2.69    2.706
P = 0.05    Test I      3.39     3.72    3.80    3.83    3.84    3.841
            Test II     4.25     4.08    4.01    3.93    3.88    3.841
P = 0.02    Test I      4.61     5.16    5.31    5.39    5.41    5.412
            Test II     6.99     6.42    6.12    5.78    5.59    5.412
P = 0.01    Test I      5.54     6.27    6.47    6.60    6.63    6.635
            Test II     9.52     8.52    7.97    7.32    6.97    6.635


for significance are shown for both tests in the case of two variance estimates with equal numbers of
degrees of freedom. These values were obtained from the well-known tabulated significance
levels for s₁²/s₂² by taking s₁² and s₂² such that s₁²/s₂² had the required critical ratio, and calculating
χ² by each of the two methods. The limiting values for n = ∞ given in the last column of the
Table correspond to the critical values of χ² from the tabulated distribution of χ² for one degree of
freedom, and are the values which would be used for significance in practice. It is seen that, apart
from the requirement of equal numbers of degrees of freedom for the different variance estimates,





the present test is much inferior to the other one for small n; and even for n = 12 is exaggerating
significance at the higher significance levels.
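
The construction of Table II can be retraced for the column n = 2, where no tables are needed, since the distribution function of the variance ratio with (2, 2) degrees of freedom is x/(1 + x). The sketch below is illustrative only and rests on two assumptions that are not spelled out in the text: test I is taken to be the corrected statistic M/C of ref. 1, which for two estimates with n degrees of freedom each has C = 1 + 1/(2n); and test II divides the observed variance of the two values of ln s² (one degree of freedom) by the theoretical variance ψ′(½n). The function names are hypothetical.

```python
import math


def trigamma_int(m):
    """psi'(m) for a positive integer m: pi^2/6 less the partial sum of 1/j^2."""
    return math.pi ** 2 / 6 - sum(1.0 / j ** 2 for j in range(1, m))


def critical_ratio_2df(p):
    """Two-sided critical ratio s1^2/s2^2 for (2, 2) d.f., from the c.d.f. x/(1 + x)."""
    upper = 1.0 - p / 2.0
    return upper / (1.0 - upper)


def chi2_test_one(ratio, n):
    """Test I (assumed form): M/C for two variances in the given ratio, n d.f. each."""
    m = 2 * n * math.log((1.0 + ratio) / 2.0) - n * math.log(ratio)
    c = 1.0 + 1.0 / (2 * n)
    return m / c


def chi2_test_two(ratio, n):
    """Test II (assumed form): variance of the two ln s^2 over psi'(n/2); n must be even."""
    observed = math.log(ratio) ** 2 / 2.0
    return observed / trigamma_int(n // 2)


for p in (0.10, 0.05, 0.01):
    r = critical_ratio_2df(p)
    print(p, round(chi2_test_one(r, 2), 2), round(chi2_test_two(r, 2), 2))
```

Under these assumptions the values at P = 0.05 come out close to 3.72 and 4.08, the entries of Table II for n = 2, so that test II would report a result exactly on the 5 per cent. point as rather more significant than it is.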

4. Example.

The following data (Table III) represent log₁₀ s for three groups A, B and C, the
columns corresponding to different days’ results. Each item corresponds to a variance estimate
with average number of degrees of freedom approximately equal to 48 (the number varied slightly
from item to item). The resulting analysis of variance is shown in Table IV, the theoretical mean
square being (log₁₀ e)²/{2(n − 1)} = 0.0020.



Fig. 5. Tail of the distribution of |ln (s²/σ²)|, scales as in Fig. 4.





Table III

Day:   1      2      3      4      5      6      7      8      9      10     11     12     13     14     15
A      1.10   1.17   1.27   1.05   1.06   1.09   1.06   1.20   1.09   1.05   1.14   1.25   1.19   1.11   1.07
B      0.93   1.02   0.92   0.90   0.85   1.02   0.98   0.96   0.93   0.97   0.93   0.97   1.08   0.88   1.03
C      1.05   0.89   1.09   0.92   0.95   1.08   0.95   0.98   0.83   0.94   1.02   0.99   1.03   0.92   0.90


Table IV

                        D.F.    S.S.      M.S.
Groups A, B, C          2       0.2667    0.1333
Days                    14      0.1047    0.00748
Residual                28      0.1005    0.00359
Theoretical variance    -       -         0.0020
Total                   44      0.4719    -


It will be seen that the residual variance is significant (P < 0.01) compared with the theoretical
variance, denoting residual heterogeneity. The “groups” and “days” terms are therefore tested
against the residual variance, giving a highly significant “groups” effect, and a just significant
(P = 0.05) “days” effect.
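
The sums of squares of Table IV can be recovered directly from the entries of Table III. The following sketch is an illustration rather than the authors’ own working: it carries out the ordinary two-way analysis of variance (three groups by fifteen days, one entry per cell) and evaluates the theoretical mean square (log₁₀ e)²/{2(n − 1)} with n = 48. The variable and function names are chosen for this example only.

```python
import math

# Table III: log10 s for groups A, B, C on days 1 to 15.
TABLE_III = {
    "A": [1.10, 1.17, 1.27, 1.05, 1.06, 1.09, 1.06, 1.20,
          1.09, 1.05, 1.14, 1.25, 1.19, 1.11, 1.07],
    "B": [0.93, 1.02, 0.92, 0.90, 0.85, 1.02, 0.98, 0.96,
          0.93, 0.97, 0.93, 0.97, 1.08, 0.88, 1.03],
    "C": [1.05, 0.89, 1.09, 0.92, 0.95, 1.08, 0.95, 0.98,
          0.83, 0.94, 1.02, 0.99, 1.03, 0.92, 0.90],
}


def two_way_anova(table):
    """Sums of squares for a groups-by-days layout with one entry per cell."""
    rows = list(table.values())
    g, d = len(rows), len(rows[0])
    grand_total = sum(sum(r) for r in rows)
    correction = grand_total ** 2 / (g * d)
    ss_total = sum(x * x for r in rows for x in r) - correction
    ss_groups = sum(sum(r) ** 2 for r in rows) / d - correction
    day_totals = [sum(r[j] for r in rows) for j in range(d)]
    ss_days = sum(t * t for t in day_totals) / g - correction
    ss_residual = ss_total - ss_groups - ss_days
    return {
        "groups": (g - 1, ss_groups),
        "days": (d - 1, ss_days),
        "residual": ((g - 1) * (d - 1), ss_residual),
        "total": (g * d - 1, ss_total),
    }


def theoretical_mean_square(n):
    """Approximate variance of log10 s when s^2 has n degrees of freedom."""
    return math.log10(math.e) ** 2 / (2 * (n - 1))


anova = two_way_anova(TABLE_III)
for name, (df, ss) in anova.items():
    print(f"{name:9s} d.f. = {df:2d}  S.S. = {ss:.4f}  M.S. = {ss / df:.5f}")
print("theoretical mean square:", round(theoretical_mean_square(48), 4))
```

The residual mean square is a little under twice the theoretical value, which is the comparison on which the conclusion of residual heterogeneity rests.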

5. A method of estimating the amount of heterogeneity. 

In the above example the occurrence of a residual heterogeneity led to the adoption of a different
basic level of variability against which the other items were tested. A definite form,

$$f(u/\alpha)\,du/\alpha, \quad \text{where } \alpha = E(u),$$

for the distribution of the “real” variance u = σ² is implied by this procedure, and effectively we
are using the residual term in the analysis to estimate the dispersion of the u-distribution, i.e.,
the “amount” of the heterogeneity. There is usually little but common sense and ultimate
convenience to guide us in postulating a hypothetical form for f(u); the analysis of variance would
be made rigorous if it were possible to choose f(u) so that the logarithm of the observed variance
(w) would be normally distributed, but this cannot be done. For let w = uv, so that u and v are
statistically independent, and v is proportional to a χ²-variate. Then the normal distribution of

$$\ln w = \ln u + \ln v$$

implies, by a theorem of Cramér,³ that ln v is normally distributed, and this gives a contradiction.

Since it is impossible to justify the analysis completely, we might try to make the resultant
distribution of ln w satisfy the conditions (b) and (d) of Section 2, by arranging that

(b) the sampling variance of ln w is independent of the mean “true” variance α; and
(d) Σ ln w is a sufficient statistic for α.

That this also is impossible follows from an unpublished theorem of L. Solomon (1944),* which
asserts that a variate (here ln w) has these two properties if and only if it is normally distributed.

It is, however, possible so to choose f(u) that ln w has the distribution investigated in Section 2,
but with the number of degrees of freedom reduced to allow for the inflated variability. Explicitly,
we choose f(u) so that w is distributed as

$$\alpha\chi^2_\lambda/\lambda,$$

the new parameter λ being a measure of the heterogeneity. In the case of complete homogeneity,
we have of course λ = n, and in general λ will be less than this value. The distribution of w
being now

$$\frac{1}{\Gamma(\tfrac12\lambda)} \left(\frac{\lambda w}{2\alpha}\right)^{\frac12\lambda - 1} e^{-\lambda w/2\alpha}\, d\!\left(\frac{\lambda w}{2\alpha}\right),$$

* We are indebted for this reference to Dr. A. C. Aitken.






the maximum likelihood estimates for α and λ (formed from samples of k independent observations)
are given by

$$\bar w \equiv \Sigma w / k = \alpha,$$

and

$$t \equiv k \ln \bar w - \Sigma \ln w = k\{\ln(\tfrac12\lambda) - \psi(\tfrac12\lambda)\} \approx k/(\lambda - \tfrac13) \quad (\lambda \text{ large}),$$

and it is readily seen that these are a pair of sufficient statistics: in fact we have (see the Appendix
for details)

$$p(S \mid \alpha, \lambda) = p(\bar w \mid \alpha, \lambda)\, p(t \mid \lambda)\, p(S \mid \bar w, t).$$

The distribution of w̄ is obtained at once from first principles, while that of t follows from the argument
given earlier by Bartlett¹ for the homogeneous case, and on employing the approximation
developed on that occasion we find

$$\bar w = \alpha\chi^2_{k\lambda}/k\lambda,$$

and

$$t = \frac{1}{\lambda}\left(1 + \frac{k + 1}{3k\lambda}\right)\chi^2_{k-1} \quad \text{approx.},$$

the two χ²-variates being independent. This approximation to the distribution of t is very accurate,
the first four cumulants being correct to 5 per cent. even when λ is as small as 5.

To illustrate the method, we shall apply it to the three “groups” A, B, and C of Table III,
estimating for each a value of λ, the heterogeneity parameter. We find:


Table V

Group    t        Estimate of λ    90% fiducial interval
A        0.852    16.8             8.1 to 28.2
B        0.583    24.4             11.6 to 41.0
C        0.815    17.5             8.4 to 29.4


It will be recalled that each variance estimate contributing an entry to Table III was based on about
48 degrees of freedom. If, therefore, the data were homogeneous, the true value of λ would be
48. Since in each case the 90 per cent. fiducial interval for λ excludes this value, there is significant
evidence for heterogeneity, the observed variances fluctuating as if they were based on about
20 instead of 48 degrees of freedom. Since it is to be expected that the day-to-day variations
for the three groups may be correlated, the three χ²-variates corresponding to the three values
of t cannot be supposed to be independent. If they were independent, it would have been possible
to test the disparity between the three λ-estimates by calculating the statistic

$$T \equiv 3 \ln \bar t - \Sigma \ln t_i = 3\{\ln(\tfrac12\Lambda) - \psi(\tfrac12\Lambda)\}, \text{ say},$$

and applying the null test corresponding to the above method of estimation, testing whether
Λ were significantly less than (k − 1) = 14. This null test, of course, is identical with the test of
homogeneity given by Bartlett in the earlier paper¹ already referred to, which itself, since the numbers
of degrees of freedom are here equal, is equivalent to the original test of J. Neyman and E. S.
Pearson.⁵

Since the three estimates of λ are not independent, it is not clear how they should be combined.
Assuming that there is no real difference in heterogeneity behaviour between the three groups,
and proceeding as if the estimates were independent, we find as the pooled estimate of λ the harmonic
mean of the three previous estimates,

λ = 19.0.
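
The entries of Table V and the pooled value can be reproduced, to within rounding, from the printed values of t. The sketch below is an illustrative reading and not necessarily the authors’ exact procedure: it assumes that t is distributed approximately as (1/λ){1 + (k + 1)/(3kλ)}χ² with k − 1 = 14 degrees of freedom, inverts this to first order in 1/λ, and substitutes for χ² its mean (for the estimate) and its tabulated 5 and 95 per cent. points (for the 90 per cent. interval). The percentage points 6.571 and 23.685 are the standard tabulated values; all names are hypothetical.

```python
K = 15                       # number of days, i.e. observations per group
CHI2_14_LOWER = 6.571        # tabulated 5 per cent. point of chi-squared, 14 d.f.
CHI2_14_UPPER = 23.685       # tabulated 95 per cent. point of chi-squared, 14 d.f.
T_VALUES = {"A": 0.852, "B": 0.583, "C": 0.815}   # column t of Table V


def lambda_from_t(t, chi2, k=K):
    """First-order inversion of t = (chi2/lambda) * (1 + (k + 1)/(3 k lambda))."""
    return chi2 / t + (k + 1) / (3.0 * k)


def harmonic_mean(values):
    values = list(values)
    return len(values) / sum(1.0 / v for v in values)


estimates = {g: lambda_from_t(t, K - 1) for g, t in T_VALUES.items()}
intervals = {
    g: (lambda_from_t(t, CHI2_14_LOWER), lambda_from_t(t, CHI2_14_UPPER))
    for g, t in T_VALUES.items()
}
pooled = harmonic_mean(estimates.values())

for g in T_VALUES:
    low, high = intervals[g]
    print(g, round(estimates[g], 1), round(low, 1), round(high, 1))
print("pooled:", round(pooled, 1))
```

None of the three intervals so obtained reaches 48, which is the basis of the statement that the evidence for heterogeneity is significant.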

On a future occasion one could thus assign fiducial limits to the mean “true” variance, given
an observed variance, or test for a real difference between variances observed for the same “group”
on different days, by assigning to each observed variance 19 instead of 48 degrees of freedom,
and then proceeding exactly as if there were no heterogeneity. One cannot, however, make any
statement about the fluctuations to be expected when an observed variance is based on other than
48 degrees of freedom. This is a severe limitation of the present method, the full force of which
will be felt when, in the next section, we show that the form f(u) of the implied heterogeneity
distribution varies with n, the (actual) number of degrees of freedom possessed by the observed




variances. In some circumstances, therefore, it will be preferable to sacrifice the internal consistency
of the method of estimation developed here, and to return to the more crude but more adaptable
method of the logarithmic transformation.

It is very instructive to analyse the same set of data for heterogeneity by means of the logarithmic
transformation. In the treatment given above we assumed that an observed variance was still
a χ²-variate, the sole effect of heterogeneity being to reduce the effective number of degrees of freedom.
We now suppose that an observed variance is a constant multiple of the product of three
independent χ²-variates, the first accounting for the normal sampling fluctuations, the second
fluctuating from day to day, and the third fluctuating from one observed variance to another
and representing the effect of general heterogeneity in the conditions of the experiment. Allotting
to these χ²-variates n, μ₁, μ₂ degrees of freedom, respectively, we wish to estimate μ₁ and μ₂. For
the natural logarithm of the square-root of an observed variance, the second cumulant is, to a good
approximation,

$$\tfrac12\left(\frac{1}{n - 1} + \frac{1}{\mu_1 - 1} + \frac{1}{\mu_2 - 1}\right).$$

(A fourth factor, varying with the identity of the group, will not concern us here.)

Thus, referring to the analysis of variance in Table IV, we have for the “days” term,

$$7\left(\frac{1}{n - 1} + \frac{3}{\mu_1 - 1} + \frac{1}{\mu_2 - 1}\right) = 0.1047\,(\ln 10)^2 = 0.5552,$$

and for the “residual” term,

$$14\left(\frac{1}{n - 1} + \frac{1}{\mu_2 - 1}\right) = 0.1005\,(\ln 10)^2 = 0.5328.$$

From the latter we obtain an estimate μ₂ ≈ 61 for the general heterogeneity, with a 90 per cent.
fiducial interval of (25 to 220). Substituting the estimate of μ₂ in the expression for the “days”
term, and neglecting the uncertainty in μ₂, we obtain an estimate μ₁ ≈ 74 for the day-to-day variations,
with a 90 per cent. fiducial interval of (24 to 340). It will be observed how the greater flexibility
of the logarithmic transformation analysis makes it possible to disentangle the “days”
effect from the general heterogeneity, and evidently more complicated situations could be dealt
with in the same way.
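
The two estimates can be retraced from the sums of squares of Table IV. The sketch below is illustrative and rests on an assumption about the structure of the expected mean squares that the text does not state in full: the residual mean square (converted to natural logarithms) is taken to estimate ½{1/(n − 1) + 1/(μ₂ − 1)}, and the “days” mean square the same quantity increased by ½ · 3/(μ₁ − 1), the factor 3 being the number of groups. With n = 48 this reading leads to the quoted values of about 61 and 74. The names used are hypothetical.

```python
import math

LN10_SQ = math.log(10.0) ** 2
N_DF = 48            # approximate degrees of freedom of each observed variance
N_GROUPS = 3         # groups A, B, C

SS_DAYS, DF_DAYS = 0.1047, 14          # Table IV, in units of (log10 s)^2
SS_RESIDUAL, DF_RESIDUAL = 0.1005, 28


def heterogeneity_estimates(n=N_DF, groups=N_GROUPS):
    """Return (mu_1, mu_2) under the assumed expected-mean-square structure."""
    ms_residual = SS_RESIDUAL * LN10_SQ / DF_RESIDUAL
    ms_days = SS_DAYS * LN10_SQ / DF_DAYS
    inv_mu2 = 2.0 * ms_residual - 1.0 / (n - 1)          # 1/(mu_2 - 1)
    inv_mu1 = 2.0 * (ms_days - ms_residual) / groups     # 1/(mu_1 - 1)
    return 1.0 / inv_mu1 + 1.0, 1.0 / inv_mu2 + 1.0


mu1, mu2 = heterogeneity_estimates()
combined = 1.0 / (N_DF - 1) + 1.0 / (mu1 - 1.0) + 1.0 / (mu2 - 1.0)
print(round(mu1), round(mu2), round(1.0 / combined, 1))
```

The reciprocal of the combined quantity is close to 19, which is the figure compared in the text with the value 18 obtained from the first method.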

If the two methods of analysis were equivalent we should have

$$\frac{1}{\lambda - 1} = \frac{1}{n - 1} + \frac{1}{\mu_1 - 1} + \frac{1}{\mu_2 - 1},$$

which suggests incidentally how values of λ corresponding to different “actual” numbers of degrees
of freedom, n, might be compared. In fact we have, for the left-hand side, an estimate of
1/18, and for the right-hand side an estimate of 1/19.

A numerical estimate of the “ amount ” of the heterogeneity is of course only of value if there is 
some reason to suppose that the variability in the conditions of experiment itself possesses some 
statistical regularity. But the methods given here provide a rational means of putting this hypo- 
thesis to the test. 

6. The heterogeneity distribution as the “quotient” of two χ²-distributions.

We shall now substantiate the claim made in Section 5 by finding the form

$$f(u/\alpha)\,du/\alpha \qquad (\alpha = E(u))$$

of the distribution of the “real” variance, u, which has the effect of changing the distribution of the
observed variance w from that of

$$\alpha\chi^2_n/n$$

to that of

$$\alpha\chi^2_\lambda/\lambda.$$

Thus, if we write w = uv, then multiplication by the independent variate u has the effect of reducing
the number of degrees of freedom of v, while leaving its mean value and the form of its distribution
unchanged.

It is to be noted that we cannot say “u = w/v, and so ½ ln u must be (apart from an additive
constant) a Fisher z-variate with (λ, n) degrees of freedom”, for w and v are not independent.



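We are, in fact, concerned with a problem in what P. Lévy has called “the arithmetic of distribution
functions.”

The resolution of the problem is formally very simple. Let h(w/α)dw/α and g(v)dv denote
the distributions of w and v, respectively, so that

$$h(w)\,dw = \frac{1}{\Gamma(\tfrac12\lambda)} (\tfrac12\lambda w)^{\frac12\lambda - 1} e^{-\frac12\lambda w}\, d(\tfrac12\lambda w),$$

while g(v)dv has the same form, λ being replaced by n; then we must have

$$h(w) = \int_0^\infty f(u)\, g\!\left(\frac{w}{u}\right) \frac{du}{u}.$$

To solve the integral equation, let F(θ) denote the Mellin transform

$$F(\theta) = \int_0^\infty u^{\theta - 1} f(u)\,du,$$

and let G(θ), H(θ) be formed in the same way with the functions g(v) and h(w). Then clearly

$$F(\theta) = G(\theta) = H(\theta) = 1, \quad \text{for } \theta = 1 \text{ and } 2,$$

while the integral equation transforms into

$$H(\theta) = F(\theta)\,G(\theta).$$

Since

$$H(\theta) = \frac{\Gamma(\tfrac12\lambda + \theta - 1)}{(\tfrac12\lambda)^{\theta - 1}\,\Gamma(\tfrac12\lambda)},$$

the formula for F(θ) is

$$F(\theta) = \left(\frac{n}{\lambda}\right)^{\theta - 1} \frac{\Gamma(\tfrac12\lambda + \theta - 1)\,\Gamma(\tfrac12 n)}{\Gamma(\tfrac12 n + \theta - 1)\,\Gamma(\tfrac12\lambda)},$$

or equivalently

$$f(u)\,du = \frac{\Gamma(\tfrac12 n)}{\Gamma(\tfrac12\lambda)\,\Gamma(\tfrac12 n - \tfrac12\lambda)} \left(\frac{\lambda u}{n}\right)^{\frac12\lambda - 1} \left(1 - \frac{\lambda u}{n}\right)^{\frac12(n - \lambda) - 1} d\!\left(\frac{\lambda u}{n}\right) \quad (0 < u < n/\lambda),$$

while f(u) is identically zero for other values of u.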

It will be observed that the distribution of the “real” variance u involves n, the “actual” number
of degrees of freedom of the observed variance. This fact, as we have already pointed out, makes
it impossible (without approximation) to apply the first method for the estimation of heterogeneity
described in Section 5 unless we are concerned always with variance estimates based on the same
number of degrees of freedom. To illustrate the nature of the distribution for different (fixed)
values of n, it has been drawn in Fig. 6 for constant mean and variance. It is easily shown that
the variance of f(u)du is

$$F(3) - [F(2)]^2 = F(3) - 1 = \frac{1 + 2/\lambda}{1 + 2/n} - 1 = \frac{2}{\mu}, \text{ say},$$

and so in order to have the same variance for two different values of n we must have

$$\frac{1 + 2/\lambda'}{1 + 2/n'} = \frac{1 + 2/\lambda}{1 + 2/n}.$$

In particular, λ′ = μ when n′ = ∞. The parameter μ might, in fact, very well be adopted as an
(approximate) absolute measure of heterogeneity, independent of the value of n. For large λ
and n we should then have

$$\frac{1}{\lambda} \approx \frac{1}{n} + \frac{1}{\mu},$$

which agrees with the alternative method of comparing values of λ associated with different values
of n given at the end of Section 5. The example there discussed gave n ≈ 48, λ = 19, and so
μ ≈ 31. The curves of Fig. 6 represent the more extreme case when n = 20, λ = 10, and so
μ ≈ 20. In either case the extra variability attributed to heterogeneity is of the same order as the
inherent variability consequent upon the finite size of the samples. The actual constants used in
the construction of Fig. 6 are given in Table VI.



Table VI

n    5        10       20    ∞
λ    3.793    6.471    10    22 (= μ)
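
The constants of Table VI follow from the condition that the variance of the heterogeneity distribution, (1 + 2/λ)/(1 + 2/n) − 1 = 2/μ, be held fixed. A short sketch, with function names chosen for this illustration only, computes μ from a given pair (n, λ), the value of λ required at another n, and the large-sample approximation 1/λ ≈ 1/n + 1/μ:

```python
def mu_exact(n, lam):
    """mu defined by 2/mu = (1 + 2/lam)/(1 + 2/n) - 1."""
    return 2.0 / ((1.0 + 2.0 / lam) / (1.0 + 2.0 / n) - 1.0)


def lam_for(n, mu):
    """lambda giving the same variance 2/mu when the actual degrees of freedom are n."""
    return 2.0 / ((1.0 + 2.0 / n) * (1.0 + 2.0 / mu) - 1.0)


def mu_approx(n, lam):
    """Large-sample approximation obtained from 1/lam = 1/n + 1/mu."""
    return 1.0 / (1.0 / lam - 1.0 / n)


mu = mu_exact(20, 10)
print(round(mu, 3))                                   # exact mu for n = 20, lambda = 10
print([round(lam_for(n, mu), 3) for n in (5, 10, 20)])
print(round(mu_approx(20, 10)), round(mu_approx(48, 19)))
```

The exact relation gives μ = 22 for the curves of Fig. 6, while the approximate relation gives the rounder figures of about 20 and 31 quoted in the text.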


It will be seen from Fig. 6 that when n = 5, λ = 3.8, the form of the distribution f(u)du is not
one that could reasonably be expected in most practical problems, and even when n = 10, λ = 6.5,
the distribution drops to zero at the upper end more sharply than we should wish. Such an infinite
“cliff” for the upper end of the distribution will occur whenever n − λ is less than 2; i.e., for small



Fig. 6. Distribution of u = σ² (in the case of heterogeneity) which has the effect of reducing the number of degrees
of freedom in the distribution of the observed variance w = s².

values of n, and also in cases when the amount of heterogeneity is small (so that λ is not much less
than n). This latter case of failure is disappointing because it makes the link we have established
between our method of estimating the heterogeneity when it is appreciable and the earlier null test
of homogeneity (test that λ = n) less satisfactory than at first appeared. However, when appreciable
heterogeneity is present and the observed variances are based on not too small a number of degrees
of freedom, the form of the heterogeneity distribution will obviously be quite acceptable as an initial
assumption.

If it should prove convenient to combine the assumption of the heterogeneity distribution
f(u)du with the logarithmic transformation, then it should be noticed that the earlier restriction
on the use of that transformation to values of n not too small now becomes the (heavier) restriction
to values of λ not too small.

We wish to make acknowledgement to Mr. F. J. Anscombe with whom these problems were
discussed at the outset of this investigation, and to the Chief Scientific Officer, Ministry of Supply,
for permission to publish the work. We should also like to thank Mr. J. E. T. Foulger for assistance
in the preparation of the graphs, and calculation of the constants recorded in Table I.

References. 

1 Bartlett, M. S., Proc. Roy. Soc., A, 160 (1937), 268-282.

2 Bartlett, M. S., J. Roy. Stat. Soc. Suppl., 3 (1936), 68-78.

3 Cramér, H., Random Variables and Probability Distributions, Cambridge, 1937 (Theorem 19).

4 Davis, H. T., Tables of the Higher Mathematical Functions, vol. 1 (1933), vol. 2 (1935), Indiana,

5 Neyman, J. and Pearson, E. S., Bulletin de l'Académie Polonaise des Sciences et des Lettres, Série A (1931),


Appendix 

The joint sampling distribution of w and t. 

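We have asserted that if

$$\bar w \equiv \Sigma w / k$$

and

$$t \equiv k \ln \bar w - \Sigma \ln w,$$

where the (w_i) are drawn independently from a distribution of the χ² form with mean α and with λ
degrees of freedom, then

$$p(S \mid \alpha, \lambda) = p(\bar w \mid \alpha, \lambda)\, p(t \mid \lambda)\, p(S \mid \bar w, t).$$

The justification for this will now be given.* It is obvious at once that w̄ will have the distribution
of

$$\alpha\chi^2_{k\lambda}/k\lambda,$$

and it can also be seen, not quite so immediately, that we shall have

$$p(S \mid \alpha, \lambda) = p(\bar w \mid \alpha, \lambda)\, p(S \mid \bar w, \lambda).$$

The point we wish to make is that t is a sufficient statistic for λ in the conditional distribution p(S | w̄, λ).
To see this, we observe that the joint sampling distribution for the k members of S is of the form

$$F(w_1 + w_2 + \dots + w_k)\,(w_1 w_2 \dots w_k)^{\frac12\lambda - 1}\, dw_1\, dw_2 \dots dw_k \quad (\text{each } w_i \ge 0),$$

and we transform the variables (following a procedure associated with the name of Dirichlet)
by writing

$$w_1 + w_2 + \dots + w_i = x_1 x_2 \dots x_{k-i+1} \quad (i = 1, 2, \dots, k),$$

so that

$$\left|\frac{\partial(w_1, w_2, \dots, w_k)}{\partial(x_1, x_2, \dots, x_k)}\right| = x_1^{k-1} x_2^{k-2} \dots x_{k-1},$$

$$w_1 = x_1 x_2 \dots x_k, \qquad w_i = x_1 x_2 \dots x_{k-i+1}(1 - x_{k-i+2}) \quad (i = 2, 3, \dots, k),$$

and

$$e^{-t} = (w_1 w_2 \dots w_k)/\bar w^{\,k}.$$

The distribution then becomes

$$F(x_1)\, x_1^{\frac12 k\lambda - 1}\, dx_1 \,(k^{-k} e^{-t})^{\frac12\lambda} \prod_{i=2}^{k} \frac{dx_i}{x_i(1 - x_i)} \quad (0 < x_1 < \infty,\; 0 < x_i < 1,\; i \ne 1),$$

from which the result follows at once, since λ enters the conditional distribution only through the factor e^{−½λt}.

The characteristic function of t is now fairly easily found to be

$$M(u) = k^{-iku}\, \frac{\{\Gamma(\tfrac12\lambda - iu)\}^k\, \Gamma(\tfrac12 k\lambda)}{\{\Gamma(\tfrac12\lambda)\}^k\, \Gamma(\tfrac12 k\lambda - iku)}.$$

Calculation of the cumulants then leads at once to the χ² approximation for the distribution of
t given by Bartlett in 1937.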


* See also E. J. G. Pitman, Proc. Camb. Phil. Soc., 33 (1937), 212-222.



SUPPLEMENT TO THE 


Journal of the Royal Statistical Society

Vol. VIII, No. 2, 1946


Statistical Methods in the Selection of Navy and Army Personnel 

By P. E. Vernon 

[Read before the Research Section of the Royal Statistical Society, March 26th, 1946,

Dr. J. Wishart in the Chair.]

Introduction 

The main theme of this paper is not so much the statistical advances made during four years of
psychological work in Navy and Army personnel selection as the makeshifts to which the statistician
may be forced when his job is to extract from highly fallible data, in the shortest possible
time, results which can immediately be practically applied. This admission does not mean that
all our work was amateurish or statistically invalid. For though I cannot lay claim myself to
any wide knowledge of statistical theory, the direction of statistical work in D.S.P. (the Army
Directorate for the Selection of Personnel) was in the competent hands of Mr. Patrick Slater;
and both he and I have received frequent and most valuable guidance from Professor Cyril Burt,
and from others, including Mr. Kendall and Dr. Fraser Roberts. The somewhat unorthodox
techniques and short-cuts that I am going to describe must be attributed first to the highly abnormal
type of material which was so often submitted for statistical analysis, and secondly to the shortage
of qualified staff for coping with the huge volume of such material and the necessity for employing
largely unskilled or only partially trained assistants for much of the tabulation and routine
calculations.

The following is a concrete example. One of the standard Army tests was a test of bodily
agility and manual deftness, based on the time taken to transfer a series of iron rings from two posts
to two other posts a standard distance away, running backwards and forwards at top speed.
Evidence had been accumulating that this test was not very reliable and that its diagnostic value was
poor for the purpose for which it was intended, namely, eliminating the least agile recruits from
infantry and gunnery. Further, it had to be done in gym shoes, and owing to the rubber shortage
the supply of these was running out. A decision was needed early in 1945 whether to withdraw
the test or to substitute something similar. Now, a number of alternative physical tests,
based on athletic and gymnastic exercises, were being used at an Army re-allocation centre.
Several, such as the 100 yards and the five-mile walk, were out-of-door tests, and it would have been
unfair to apply them during wintry weather. The results of examination of 490 soldiers on these and
on the agility test, together with age, height, weight, medical category and general intelligence
score (14 variables in all) at last arrived, barely a fortnight before the answer had to be given.
The physical tests had been applied in four different camps, by different instructors, and it was by
no means certain that conditions were identical throughout. Some rapid analyses of variance
between camps showed that differences were indeed significant, but so small as to have only a slight
effect on test inter-correlations. Many of the distributions were far from normal: for example,
the 100 yards, and pull-ups to the chin, as shown in Table I. No assistants were available who
were conversant with centroids or mean class values, and I had too much other work to calculate
the 98 correlations by centroids myself. However, applying the centroid method to a few specimen
correlations yielded coefficients very little different from those given by ordinary product-moment
technique. An assistant therefore tabulated and calculated the 98 product-moment coefficients in
about eight days. I then carried out eight factorial analyses by simple summation (a modification





Table I

Specimen distributions of scores on personnel tests, training marks, and grades

100 yards       Pull-ups       Map-reading marks     Morse receiving       Percentage grades awarded
sprint                         of motor-driving      scores for            by two Officer Selection
                               trainees              accuracy              Boards (N = 927 and 901)

Secs.   f*      No.    f       %       f             %         f           Grade               %
11      4       10     55      95+     42            100       12          A                   4
12      52      9      19      87+     53            99.8+     15          B                   56
13      126     8      43      79+     65            99.6+     12          C                   15
14      104     7      50      71+     46            99.4+     10          Fail or put back    25
15      70      6      60      63+     25            99.2+     6
16      40      5      60      55+     16            99.0+     6
17      27      4      77      47+     7             98.8+     6           A                   12
18      27      3      55      39+     9             98.6+     2           B                   37
19      11      2      34      31+     2             99.3      2           C                   4
20      13      1      14      30-     1             98.0      2           Fail or put back    47
21      0       0      23                            97.8      3
22      5                                            96.7      1
23      6                                            96.0      1
24      0                                            93.7      1
25      3
26      2
Total   490            490             266                     79

* f = frequency of corresponding score.


of Thurstone’s technique), this number being needed to yield communalities by successive approximation,
and in order to find the effects of holding Age, Intelligence, Height and Weight constant.
I thus obtained the answer in four more days: namely, that the nine physical tests do measure a
fairly prominent general factor with group factors common to the running tests, the jumping
tests and the heaving tests; but that the standard Agility Test is very poorly saturated with the
general and with the more specialized types of physical capacity, so that the proposed tests could
not be substituted for it. Obviously, the statistical analysis is open to criticism on the grounds of
unreliability of data, illegitimacy of applying product-moment correlation and factor analysis to
non-normal distributions, and so forth. But would any other approach have given a usable result
in the time, and is it likely that such a result would have been appreciably different from mine?

This was perhaps an extreme case, but the urgency and the shortage of skilled assistance were
ever-present circumstances. True, the Army D.S.P. possessed an excellent statistical department
in which, at times, as many as thirty people were working, and the Senior Psychologist of the Royal
Navy had a rather smaller organization. Though electric calculating machines were provided,
it was only rarely that Hollerith or other mechanical tabulating devices were available for our
work. During the past four years I must have calculated some 8,000 correlation coefficients of all
types, with a median population of about 200 recruits, and I did the tabulations myself for half of
them. In addition, there were roughly 1,000 factor analyses, multiple correlations, corrections
for multivariate selection and analyses of variance. I should add that I thoroughly enjoyed all
this, together with the other aspects of my work, such as test construction and training investigations.
But it helps to explain why my paper is mainly concerned with short-cuts, unsolved problems,
and practical applications, rather than with any fundamental contributions to methodology.

Types of data 

It is desirable first to outline the kinds of data with which we were chiefly concerned, since the
peculiarities of the distributions largely determined the peculiarities of methods of analysis.
Although the abilities and attainments and other qualities of the recruits were probably normally
distributed in the great majority of instances, our measurements of these variables were frequently
abnormal. Five main types may be mentioned.









(i) Test scores. Tests of general intelligence and of practical, mechanical, spatial and other 
similar aptitudes can, of course, by suitable selection of items, always be made to yield normal 
distributions in unselected groups. But it is considerably more difficult to devise good hard items 
than average or easy items ; a difficult reasoning problem or a mechanical item is apt to become 
unduly specialized or intricate. Moreover, the time limit for a test has to be lengthened if the top 
end of the scale is to be covered adequately, and this is a grave disadvantage when the total time 
available for tests is severely restricted. From the standpoint of selection, also, we are more in- 
terested in discovering the poorest than the best men. Thus, for many reasons, tests which give 
negatively skewed distributions of scores are the most practically useful. 

Educational tests measuring verbal or mathematical abilities, on the other hand, always tend to
yield positively skewed distributions. Roughly one-quarter of the population left school before
14, or were below the top elementary class if they left at 14; about one-half reached the top class and
left at 14; only one-quarter received any secondary education, only 6 per cent. achieved School
Certificate standard, and less than 2 per cent. reached University. But the men and women with
higher education do vastly better than the other 75 per cent. on tests involving education.

These, and certain other minor, reasons explain why it was the exception rather than the rule 
for our standard tests to yield normal distributions. Two of the best Army tests, in fact, actually 
gave an almost rectangular, and a slightly U-shaped, distribution. 

(ii) An important part of our work was item analysis of new tests. With very few exceptions,
the tests were objectively scored, so that recruits could only obtain a right mark or zero on each
item. Item distributions were therefore 2-point ones, ranging from 1 or 2 per cent. right on the
most difficult items, through 50, to 99 per cent. right on the easiest ones.

(iii) A good deal of other information was collected about each recruit besides his test scores. 
Age distributions varied widely during the course of the war, according to the Ministry of Labour’s 
policy in reserving and de-reserving older groups. Incidentally, this constituted a continual 
source of difficulty in establishing test norms. Educational standard was assessed, in the Army, 
on an 8 by 2 scale. The 2-scale referred to academic on the one hand versus technical or vocational 
on the other hand. The 8 points referred to the standard reached — University degree, Higher 
Certificate, School Certificate, top class elementary, and so on. As already mentioned, this scale 
is strongly positively skewed, and the 2-fold sub-classification added to the complexities of this 
variable. Civilian occupation was an important piece of information, which often needed to be 
classified under one of half a dozen or so broad groups, for statistical purposes. It is doubtful 
whether we are any further forward in establishing a satisfactory psychological classification than 
we were in 1941. Army recruits were, in addition, assessed for certain character qualities, such as 
aggressiveness or lack of it, for potentiality as future officers, and for liability to nervous breakdown 
(the latter being based on an interview by a psychiatrist). Such data generally yielded 2-point 
distributions, with about 5 per cent. of recruits in one category, 95 per cent. in the other, or 
perhaps a 3-point distribution with 5, 90 and 5 per cent. respectively. Finally, the recruits recorded 
on a questionnaire their spare-time interests and any useful experience they had had, in industry, 
motor-driving, scouting, and so forth. Here, too, the distributions were generally 2-fold, and 
ranged, for example, from about 80 per cent. stating that they liked reading and football, to less 
than 20 per cent. liking acting and radio repairs. 

(iv) The criteria, against which the value of our tests and of selection procedure in general were 
gauged, often consisted of marks awarded during training courses. Sometimes such marks were 
badly skewed (two specimens are given in the third and fourth pairs of columns in Table I), and 
more often than not they showed variations in standard from one class or squad to another. When 
the variations were very large it was necessary to express the marks as relative to each squad. A 
conversion based on percentiles was usually more appropriate than turning the marks into sigma 
scores. For example, one quick, though crude, method was to reduce the marks to a 3-point 
distribution, and to put the top third of any squad in the top category, regardless of the absolute 
level of the men’s marks, the middle third in the second, and the bottom third of all squads in the 
last category. 
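
The crude 3-point conversion just described can be sketched as follows. This is only an illustration under stated assumptions: the function name, the squad labels and the marks are hypothetical, and tied marks are simply left in their original order.

```python
def grade_within_squads(marks_by_squad):
    """Return {squad: [category per man]}, 1 = top third, 2 = middle, 3 = bottom third.

    Each man is graded relative to his own squad only, so the absolute
    level of a squad's marks has no influence on the result.
    """
    graded = {}
    for squad, marks in marks_by_squad.items():
        n = len(marks)
        # Rank the men within their own squad, best mark first.
        order = sorted(range(n), key=lambda i: marks[i], reverse=True)
        categories = [0] * n
        for position, man in enumerate(order):
            # Positions 0..n-1 are cut into three nearly equal bands.
            categories[man] = 1 + (3 * position) // n
        graded[squad] = categories
    return graded


# Hypothetical marks from a lenient and a severe squad.
squads = {
    "lenient": [92, 85, 88, 79, 95, 90],
    "severe": [41, 55, 38, 60, 47, 52],
}
result = grade_within_squads(squads)
```

A man with 60 marks in the severe squad lands in the top category, while a man with 85 in the lenient squad lands in the bottom one, which is the intended effect of treating marks as relative.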

When the trainees received several sets of marks it was often found that, although the Training 
School added the whole lot together, they were better grouped, on the basis of factor analysis, into 
two or three distinct sets. Thus among signallers and telegraphists the marks for Morse and other 
practical work showed little agreement with marks for theory and bookwork. In one instance 
marks on two theory examinations ranged from about 15 to 95 per cent., Morse and other practical 
marks from about 75 to 100 per cent. The Training School solemnly added all these together to 
give a total mark out of 500, which was of course almost wholly determined by theory. Closer 
enquiry revealed, however, that the School took little account of its own totals and only failed those 
men who did not reach a sufficient standard in Morse. 

Clearly it was necessary to scrutinize any sets of marks submitted to us, and to break them down 
and recombine them in various ways before we could accept them as adequate criteria of the recruits' 
proficiency. Sometimes no marks were provided, only Pass or Fail results, and here, too, varia- 
tions of standards were troublesome. Thus in three Driver Training Schools failure rates over a 
considerable period were 16½, 29½ and 7½ per cent., though we knew that the Schools were receiving 
trainees of just about the same average quality. The differences are highly significant (F-ratio = 
18.37 with 2 and 1,085 degrees of freedom). In such cases we found it best to correlate tests with 
criteria separately in each group and combine the resulting coefficients through the z-transforma- 
tion. Another problem which was never satisfactorily settled was what to do with recruits who 
passed after taking part of the course over again. Their original marks might be below the border- 
line of, say, 50 per cent., but on re-sitting they might obtain 70 per cent. My own solution was 
usually to group these men in a single category between those who passed at their first attempt and 
those who failed outright. 
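
Combining coefficients from separate groups through the z-transformation might be sketched as below. The weighting of each z by n - 3 is the conventional choice and is an assumption here, since the text does not say how the groups were weighted; the three (r, n) pairs are invented for illustration.

```python
import math


def combine_correlations(groups):
    """Pool correlations from separate groups via Fisher's z-transformation.

    groups: list of (r, n) pairs, one per school or squad.
    Each r is turned into z = atanh(r), the z values are averaged with
    weights n - 3 (assumed weighting), and the mean z is turned back into r.
    """
    weighted_sum = sum((n - 3) * math.atanh(r) for r, n in groups)
    total_weight = sum(n - 3 for _, n in groups)
    return math.tanh(weighted_sum / total_weight)


# Hypothetical test-criterion correlations in three Driver Training Schools.
schools = [(0.42, 400), (0.35, 350), (0.50, 338)]
pooled = combine_correlations(schools)
```

Because each correlation is computed inside one school, differences in failure rate between schools cannot inflate or depress the pooled value.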

(v) Finally, it was often necessary to use gradings or assessments of the recruits’ efficiency in the 
absence of any reliable examination results. Thus among thousands of seamen, instructed in 
classes of 30, the instructors picked out for us the top 3 and bottom 3 men who, in their opinion, 
were most and least likely to make good seamen. Here, then, we had a 3-point distribution — 
10 per cent. of bests, 80 per cent. ungraded in the middle, and 10 per cent. of worsts. 

In many investigations we were able to get the instructors or officers to grade on a 10, 20, 40, 
20, 10 per cent. — that is, a 5-point — scale. Such assessments are, of course, purely relative ones. 
But the danger of accepting absolute gradings from Service officers is shown by the two distributions 
compared in the last column of Table I. Obviously the standards corresponding to A, B, C and 
Fail are quite different at the two Boards. (It should be added that these figures were collected 
before the present, more scientific, system of officer selection was instituted.) 

Several other techniques have proved useful. A.T.S. officers and instructors are often willing 
to rank their squads, particularly when the names are given them each on a separate card so that 
they can be shuffled about. Up to 30 can be ranked at a time, and these rank orders then converted 
into sigma scores. Sometimes it is better not to ask for gradings, but for a psychologist to hold a 
conference with the officers who know the men best, to discuss each man’s qualities and defects, 
and for the psychologist himself to decide which men are, say, outstanding, above average, average, 
below, and definitely poor. But the most widely used technique consists in splitting the conception 
of efficiency into a number of descriptions of specific qualities; anywhere from 6 to 30 
qualities have been used. A conference is held and each recruit is given a +, 0 or − grading on 
each of the items after discussion among the officers who know that recruit. A representative of 
D.S.P. guides the conference and sees that the most junior officer or N.C.O. present always gives 
his opinion first. The method is quicker than it sounds, and it yields very satisfactory data from 
the statistical viewpoint. For the items can be inter-correlated and submitted to factorial analysis ; 
scores on the most useful ones can then be combined to give a really adequate distribution of 
proficiency. 

Correlational techniques 

From the preceding section it may be seen that the variables to be inter-correlated were seldom 
suitable for product-moment technique, though sometimes the test scores, gradings, etc., could 
be redistributed into approximately normal form. Product-moment correlations were nevertheless 
used more than any other, because of their greater reliability. Yet they are not always the 
most appropriate for selection investigations, since regressions are frequently non-linear. For 
example, a test of aptitude for Morse correlated fairly well with subsequent Morse proficiency at 
the top end of the scale, but not at all at the bottom end. And as this test was used primarily 
for eliminating men unlikely to learn telegraphy, a product-moment coefficient based on the 
whole range of scores was definitely misleading. Another consideration was that our experimental 
populations were often very large, so that product-moment correlations took a long time to tabulate; 
quicker, even if less reliable, types of correlation were perhaps justifiable. 




How were we to deal with our numerous 2-, 3- or 5-point distributions? Some statisticians 
would claim that χ² tests of significance of association would be the only legitimate form of analysis.* 
But χ² is very awkward to handle when one wishes to try the effects of combining several tests into 
a diagnostic battery. One makeshift was to convert χ² into a coefficient of contingency; then a simple 
correction based on numbers of categories yields a correlation ranging from 0 to 1.0, which is 
fairly closely comparable to a product-moment coefficient.† But it is obviously undesirable to use 
such coefficients in multiple correlation and factor analysis calculations. A far better solution is to 
turn the distributions into centroids or, in Kelley’s terminology,‡ mean class values. Correction 
for coarse grouping is automatically allowed for, by substituting a for each a* in the denominator. 
The technique is a lengthy one, though the centroids themselves can be read off with sufficient 
accuracy from a table ; § nevertheless it is preferred now for all our more important calculations. 
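
As an illustration of what such a table of centroids contains, the following sketch computes mean class values on the assumption that the graded quality is normally distributed underneath the categories; that assumption, and the function name, are mine rather than anything stated in the text. The mean of a unit normal variable between two cuts is the difference of the ordinates at the cuts divided by the proportion lying between them.

```python
from statistics import NormalDist


def mean_class_values(proportions):
    """Centroid (mean class value) of each ordered category.

    proportions: category proportions from lowest to highest, summing to 1.
    Assumes the underlying variable is unit normal (an assumption of this sketch).
    """
    nd = NormalDist()
    values = []
    cumulative = 0.0
    lower_ordinate = 0.0  # normal density at minus infinity
    for p in proportions:
        cumulative += p
        if cumulative >= 1.0 - 1e-12:
            upper_ordinate = 0.0  # normal density at plus infinity
        else:
            upper_ordinate = nd.pdf(nd.inv_cdf(cumulative))
        # Mean of the unit normal between the two cuts.
        values.append((lower_ordinate - upper_ordinate) / p)
        lower_ordinate = upper_ordinate
    return values


# The 10, 20, 40, 20, 10 per cent. grading scale, worst category first.
five_point = [0.10, 0.20, 0.40, 0.20, 0.10]
centroids = mean_class_values(five_point)
```

For the 10, 20, 40, 20, 10 per cent. scale the extreme categories receive values of about -1.75 and +1.75 and the middle category receives 0; these scores can then be entered into an ordinary product-moment calculation.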

When an extended distribution was to be compared with a 2- or 3-fold one, I generally resorted 
to biserial correlation. This is, however, inappropriate for correlating, say, test scores with Pass- 
Fail results, since it is based on the wrong regression line. We do not require to know how much 
poorer at our tests are men who fail their training than men who pass ; rather we wish to find how 
much worse low test-scorers do in their training course than high scorers. The method is more 
suitable therefore for comparing, say, spare-time interests (a 2-fold distribution) with course marks 
(an extended distribution), also for finding the consistency of separate items in a test by correlating 
each with the total score on that test. 

A useful extension of biserial r was worked out for us by Professor Burt for dealing with our 
results from bests and worsts. Given, for example, nothing but the test scores of the most and the 
least efficient 10 per cent, of recruits, the following is one of the formulae at which he arrived : 

$$ r_{\mathrm{bis}} = \frac{M_b - M_w}{\cdots\,\sigma} $$

Mb and Mw are the means of the bests and worsts, σ the standard deviation of their combined 
distribution. Such coefficients, however, are poor in reliability, their Standard Errors being rather 
larger than those of product-moment correlations based on a population of the same size as the 
bests and worsts combined. In most of our investigations, therefore, we did our best to obtain 
fuller information about the middles, and not only the two extreme groups. 

A disadvantage of biserial correlation is that when, as so often happened, the extended distribution 
is skewed, the coefficients are apt to become exaggerated; values greater than 1.0 were 
sometimes obtained, and this led to the method being banned in D.S.P. Nevertheless, I still consider 
it a valid and useful technique provided the distributions can be normalized. 

The alternative is correlation ratio, which is very readily calculated by analysis of variance. 
It is, of course, merely the square root of the ratio of sum of squares between groups (e.g., between 
Passes and Fails on a particular test item) to total sum of squares (of test scores). A serious defect, 
however, is the dependence of η, the correlation ratio, on the proportions in the two contrasted 
groups. Thus we cannot compare the consistency ratio of a test item passed by only 5 per cent. of 
recruits with that of an item passed by 50 per cent. of recruits, since it can be shown that the maximum 
value of the correlation ratio is only about one-half in the former case what it is in the latter. My 
own preference in item analysis is for the simple χ², but members of D.S.P. staff have carried out 
extensive studies with correlation ratio and other techniques which will, I hope, be published soon. 
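
The correlation ratio as defined above, and its dependence on the proportion passing, can be sketched as follows. The second function is an assumption of this sketch rather than anything in the text: it takes total scores to be normally distributed and supposes that an item divides recruits perfectly at a single cut, which gives the largest value η can then reach.

```python
import math
from statistics import NormalDist


def correlation_ratio(groups):
    """eta = sqrt(sum of squares between groups / total sum of squares).

    groups: list of lists of scores, e.g. [scores of Passes, scores of Fails].
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return math.sqrt(ss_between / ss_total)


def max_eta_normal(p):
    """Largest eta for a 2-fold split with proportion p in one group,
    assuming unit-normal scores divided perfectly at one cut (sketch assumption)."""
    nd = NormalDist()
    ordinate = nd.pdf(nd.inv_cdf(p))
    return ordinate / math.sqrt(p * (1 - p))


eta_small = correlation_ratio([[1, 2, 3], [4, 5, 6]])
ratio = max_eta_normal(0.05) / max_eta_normal(0.50)
```

Under these assumptions the ceiling for a 5 per cent. item comes out at roughly six-tenths of that for a 50 per cent. item, which is of the same order as the "about one-half" quoted above.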

Often the distributions on both the inter-correlated variables were so restricted or irregular that 
the scatter was reduced to 2 x 2, and a tetrachoric correlation was obtained. The calculation of 
tetrachorics by means of Pearson’s Tables is extremely tedious. Professor Burt provided us with a 
convenient trigonometrical approximation, but even this was beyond the scope of most of our semi-trained 
assistants, so that I was led to explore the possibilities of graphical methods. Thurstone || 

* Cf. Chambers, E. G., “Statistics in Psychology and the Limitations of the Test Method,” British 
Journal of Psychology, 1943, Vol. 33, pp. 189-199. Also: “Statistical Psychology: A Plea for Scientific 
Method,” Paper read to the British Psychological Society, September 22nd, 1943. 

† Cf. Guilford, J. P., Psychometric Methods (McGraw-Hill, 1936), p. 359. 

‡ Kelley, T. L., Statistical Method (Macmillan, 1924), pp. 168-171. 

§ Garrett, H. E., Statistics in Psychology and Education (Longmans, Green, 1937), Table XXVII. 

|| Thurstone, L. L., et al., Computing Diagrams for the Tetrachoric Correlation Coefficients (University 
of Chicago Bookstore, 1933). 




points out that tetrachoric correlations are in any case so unreliable that it is merely misleading to 
determine them to more than two places of decimals. Unfortunately, four variables are included in 
a tetrachoric scatter such as Table II, namely, p, p', d (or any other cell frequency), and the 
coefficient r_t. As graphs cannot conveniently delineate more than three variables, some restriction 
is unavoidable. 

Table II 

Scatter for a tetrachoric correlation 

             High     Low     Total 
  High        a        b        p 
  Low         c        d        q 
  Total       p'       q'      1.0 

Thurstone’s well-known Computing Diagrams get over the difficulty by holding 
p (or p') constant in each graph, and providing a series of graphs for the values of p : 0.50, 0.49, 
0.48, etc. Thus interpolation is required between two graphs if p = 0.485, for example. An 
alternative method which I devised was to find the geometric mean of p and p' — say p''. A single 
graph will then cover all values of p'', d and r_t. When p and p' are nearly equal, this graph alone 
will suffice; but when they are unequal, a second graph supplies an approximate correction. If 
the inequality is very great, as when p = 0.5 and p' = 0.2 or less, the correction is not accurate 
enough. Eventually, therefore, a method was adopted for which Professor Burt is again responsible. 

Burt’s method consists in calculating a/p and c/q. He has shown that for any given value of these 
two ratios, r_t is but little affected by alterations in the fourth variable, namely p'. Hence instead 
of Thurstone’s forty odd graphs, only four graphs are needed to cover the same range. These were 
drawn for p' = 0.50, 0.34, 0.21 and 0.10. As an example of the slow rate of change of r_t with p', 
when a/p = 70 per cent. and c/q = 30 per cent., r_t as read off from the four graphs is 0.60, 0.57, 
0.55 and 0.50, respectively. Thus interpolation between different graphs is seldom needed, and 
the average tetrachoric correlation can be worked out and read off by a statistical assistant in about 
two minutes. 
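
For readers without access to the graphs, a tetrachoric coefficient can be found numerically. The sketch below is not Burt's graphical method nor Thurstone's diagrams: it is a plain numerical substitute which finds, by bisection, the correlation of a bivariate normal surface that reproduces the High-High cell of a scatter laid out as in Table II. The function names and the integration scheme are my own choices.

```python
import math
from statistics import NormalDist

_ND = NormalDist()


def _both_high(h, k, r, steps=200):
    """P(X > h, Y > k) for a bivariate normal with correlation r.

    Uses P(r) = P(0) + integral from 0 to r of the bivariate normal
    density at (h, k), evaluated by Simpson's rule.
    """
    def density(t):
        s = 1.0 - t * t
        exponent = -(h * h - 2 * t * h * k + k * k) / (2 * s)
        return math.exp(exponent) / (2 * math.pi * math.sqrt(s))

    width = r / steps
    total = density(0.0) + density(r)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * width)
    return (1 - _ND.cdf(h)) * (1 - _ND.cdf(k)) + total * width / 3


def tetrachoric(a, b, c, d):
    """Tetrachoric r from the four cells of Table II (a = High on both variables).

    Cells may be frequencies or proportions.
    """
    n = a + b + c + d
    a, b, c, d = a / n, b / n, c / n, d / n
    h = _ND.inv_cdf(1 - (a + b))  # cut on the row variable
    k = _ND.inv_cdf(1 - (a + c))  # cut on the column variable
    lo, hi = -0.999, 0.999
    for _ in range(60):  # bisection: P(both high) rises steadily with r
        mid = (lo + hi) / 2
        if _both_high(h, k, mid) < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# a/p = 70 per cent. and c/q = 30 per cent. with both variables split at the median.
r_t = tetrachoric(0.35, 0.15, 0.15, 0.35)
```

With both variables split at the median and a/p = 70 per cent., c/q = 30 per cent., the routine gives about 0.59, close to the 0.60 read from the first of the four graphs.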

With so quick a method, more than one correlation can often be determined for the same correlation 
scatter, and the two or more values combined. True, we do not know the Standard Error of an 
average r_t, but its reliability is certainly greater than that of a single coefficient, and seems to be not 
much inferior to that of a product-moment correlation. I have sometimes applied this method 
even to extended distributions where product-moment correlations would have been feasible, in 
order to save time. Usually I split the distributions close to the 67th and 33rd percentile levels — 
that is, I compared the top, middle and bottom thirds on each variable, and averaged the two values 
of r_t. In one investigation, 1,200 Army boy tradesmen had taken between 11 and 18 different 
selection tests. They had been trained for various trades — fitters, carpenters, electricians, and so 
on — in small groups under different instructors. Had I correlated all the tests with the later 
training results by product-moment technique, at least 2,000 coefficients would have been needed. 
That meant that the work would never have been done at all, as no one had enough time. But by 
treating all test scores and training marks as relative, and simply tabulating them in three categories, 
numerous small classes of boys could be combined. The total number of coefficients dropped to 
900, and I was able to do all the tabulation and calculation, almost single-handed, in a fortnight. 

Factor analysis 

Factor analysis constitutes a fascinating occupation, with perhaps as strong an aesthetic appeal 
as any branch of statistics. But in selection work its practical value is decidedly limited. It is 
genuinely useful for classifying sets of marks or grades awarded by Training Schools, as in the example 
cited above of Morse and bookwork among telegraphists. The other main sphere of application 
is in the development of new tests, where it helps to determine what abilities are measured by these 
tests in terms of factors defined by old-established tests. Knowledge of the factor content of a new 
test does not, of course, tell one much about its usefulness for predicting efficiency in naval or army 
jobs. Nevertheless, as a general rule, no test is likely to be vocationally valuable unless its communality 
is high when it is factorized as a member of a varied battery of tests. For example, a mechanical 
or manual test which does not overlap strongly with other mechanical tests, and does not obtain 
high general or group factor loadings, is unlikely to be of much use in the selection of mechanics. 

The factorial technique most commonly employed has been Thurstone’s Centroid technique,* 
or what Burt † denotes as Simple Summation. But we have seldom carried out rotation of axes 
as Thurstone does in an attempt to attain so-called “ Simple Structure ” — that is, a set of orthogonal 
common factors each accounting for 10 to 20 per cent. of the test variances. For our results have 
pointed inescapably to a theory of mental structure more akin to that of Spearman than Thurstone, 
in spite of the fact that no Navy or Army psychologist or statistician was an ardent disciple of 
Spearman. Over and over again our test inter-correlations have yielded a general factor running 
through all the tests and covering 30 to 40 per cent. of variance, together with two, three or more 
group factors, each confined to a few tests and collectively covering less than 20 per cent. 

Thurstone’s technique, with little or no rotation, provides a good indication of what group factors 
are present; and my own practice has been to go on from here by Burt’s, Holzinger’s ‡ or 
other group-factor techniques in an endeavour to determine general and group factor loadings 
which would, when multiplied out, duplicate the original correlation matrix as closely as possible. 
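
The first step of the Centroid or Simple Summation procedure, and the check that loadings "multiplied out" reproduce the correlations, can be sketched as follows. The sketch assumes that communality estimates already stand in the diagonal and that all correlations are positive, so that no reflection of signs is needed; the four loadings used to build the example matrix are invented.

```python
import math


def first_centroid_loadings(matrix):
    """First-factor loadings by simple summation.

    matrix: symmetric table of correlations with communality estimates
    in the diagonal. Each loading is the column sum divided by the
    square root of the grand total of the table.
    """
    column_sums = [sum(col) for col in zip(*matrix)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


def residuals(matrix, loadings):
    """Correlations left over once the factor's contribution is removed."""
    n = len(matrix)
    return [[matrix[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


# Hypothetical battery in which a single general factor accounts for everything.
general = [0.8, 0.7, 0.6, 0.5]
table = [[gi * gj for gj in general] for gi in general]
loadings = first_centroid_loadings(table)
left_over = residuals(table, loadings)
```

In this artificial case the recovered loadings equal the general-factor loadings and the residuals vanish; with real data, clusters of sizeable positive residuals are what point to group factors.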

No doubt the main reason for the difference between our conclusion and those of Thurstone and 
other American psychologists is that we have been working with such heterogeneous populations. 
Probably we have had at our disposal more representative samples of the whole adult population, 
male and female, than any pre-war factorist ever had. In contrast, Thurstone has chiefly worked 
with highly selected groups of University students and secondary-school pupils. We have obtained 
ample confirmation of the point, which Thomson has made so clearly, § namely, the distortions 
introduced into the factors by selectivity among some, or all, of the tests. 

Much light has been thrown on the nature of the group factors which, although more prominent 
than Spearman himself realized, are, as I have stated, of very limited magnitude. A remarkably 
consistent grouping of tests into two main types — verbal, arithmetical and educational on the one 
hand, versus practical, mechanical and spatial on the other hand — recurs again and again. Not 
only among unselected soldiers and sailors, but also among skilled tradesmen, officers, even among 
African natives, similar group factors emerge. Curiously, however, this patterning or structure of 
abilities seems to be less clear-cut among women. These broad group factors are, in addition, 
found to break down or sub-divide indefinitely into smaller types whenever suitable batteries of 
tests are analysed. Thus arithmetical, scientific, literary and clerical sub-group factors have been 
discovered within the verbal-educational complex, and the mechanical-spatial complex has yielded 
relatively distinct mechanical information, manual, and visuo-spatial sub-factors. No other group 
factors have been found with such wide scope as these two, but a physical capacity factor was 
mentioned at the beginning of my paper, and other minor ones have suggested themselves when 
batteries of more specialized tests have been studied. 

Multiple correlation 

As Thomson || has shown, multiple correlation technique is more important in selection than 
factor analysis, since it enables us to calculate how several tests will work in combination. Thus 
the general intelligence test which has been the most widely used in the Navy and Army, having 
been applied to some three million men and women, is gradually falling into disuse or being omitted 
altogether, since we have found so repeatedly that a battery of verbal-educational and spatial- 
mechanical tests covers the same ground more adequately. In combination with them the general 
test usually has a zero or slightly negative β coefficient. Again, in trying out any new test for a 
particular job — say motor-drivers, mechanics, Radar operators — the validity of the test itself is a 
secondary consideration. The chief point to be investigated is whether this test adds enough to the 
standard battery, or raises multiple r sufficiently, for the time and trouble it involves to be worth 
while. 
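
How a test with a respectable validity can still earn a zero or slightly negative β in combination may be seen from a small worked case. The sketch below obtains β coefficients and multiple r by solving the normal equations directly; it is not Burt's Single or Complete Division layout, and the correlations are invented for illustration.

```python
import math


def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting for a small linear system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def multiple_correlation(inter, validities):
    """Return (beta coefficients, multiple r) from test inter-correlations
    and the tests' correlations with the criterion."""
    betas = solve(inter, validities)
    r_squared = sum(b * v for b, v in zip(betas, validities))
    return betas, math.sqrt(r_squared)


# Hypothetical battery: verbal, spatial-mechanical, general intelligence.
inter = [
    [1.00, 0.30, 0.75],
    [0.30, 1.00, 0.65],
    [0.75, 0.65, 1.00],
]
validities = [0.40, 0.45, 0.44]
betas3, r3 = multiple_correlation(inter, validities)
betas2, r2 = multiple_correlation([row[:2] for row in inter[:2]], validities[:2])
```

Here the general test correlates 0.44 with the criterion, about as well as either of the others, yet its β is slightly negative and adding it to the two specialized tests raises multiple r only negligibly.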

Now, most psychologists would suppose that multiple correlation would tell us how to weight 

* Thurstone, L. L., A Simplified Multiple Factor Method (University of Chicago Press, 1933). 

† Burt, C. L., The Factors of the Mind (University of London Press, 1940). 

‡ Holzinger, K. J., Student Manual of Factor Analysis (University of Chicago, Department of Education, 
1937). 

§ Thomson, G. H., The Factorial Analysis of Human Ability (University of London Press, 1939). 

|| Op. cit. 




our tests in order to predict efficiency in a variety of jobs. For instance, verbal tests are likely to be 
most relevant in selecting clerks, mechanical-spatial tests in selecting mechanics, and the β coefficients 
should show us how the tests in our standard battery can be most effectively combined. We have, 
indeed, with this end in view, determined literally hundreds of multiple correlations, using efficiency 
at different Navy, Army or A.T.S. jobs as the criterion. Here again Professor Burt has assisted 
us by introducing what he calls the techniques of Single Division and Complete Division, which are 
considerably simpler than Doolittle's, Aitken’s and other methods. The method of Complete 
Division can be applied even by assistants who have not sufficient training to work out standard 
deviations and product-moment correlations. 

Most of our validation has to be done with selected groups — that is, we can only find out if our 
tests for, say, mechanics, are working well by following up the careers of mechanics who were 
originally chosen largely on the basis of these tests. Thus it has become standard procedure not 
only to determine the multiple correlation, but to correct the original validity coefficients and the 
multiple correlation for selectivity or undue homogeneity in the population, adopting the technique 
of correction for multivariate selection which Burt has described.* 

The more widely these methods have been used, the less satisfactory do they appear, and the 
more difficulties they raise — difficulties which, unfortunately, we have never had the time to sit 
down and consider seriously. In the first place, working out all the inter-correlations or covariances 
of the battery of selection tests in each group of recruits is exceedingly lengthy. Moreover, 
some of the groups, engaged in rather unusual jobs, may be unavoidably small, even less than 100, 
and this means that the inter-correlations are far from reliable. On occasion, therefore, I have 
preferred to take over the covariances from another, larger group — provided that its means and 
variances are much the same as those of the actual experimental group, both because these larger- 
group covariances are more reliable, and because of the consequent saving in time. Yet this makeshift 
is obviously open to criticism. 

Next, it is not always realized that with smallish experimental groups, multiple r is inevitably 
exaggerated. For when the validity coefficients have considerable Standard Errors, some of them 
are likely to be unduly large, others unduly small, and it is always the largest validities which 
determine the size of multiple r. 

Thirdly, multiple correlation often gives the misleading impression that certain tests are much 
more valuable than others. Unless the experimental population is very large, the β coefficients 
have low reliabilities, and it can easily happen that a test which is apparently valueless in one group 
of subjects jumps to being one of the most valuable in another, parallel, group. This is especially 
likely to occur when the test battery contains three or four highly inter-correlated tests. All these 
may show similar correlations with the criterion, but the one which, perhaps by chance, surpasses 
the other two usually achieves a far higher β coefficient. It is simply not true that this test is very 
valuable, the others valueless. More reliable predictions of efficiency would certainly be obtained 
from all three. 

Thus multiple correlation technique is, in fact, too efficient, since it shows how the very best 
predictions can be obtained in the particular experimental group. Only if this group is extremely 
large, and if other factors, such as the quality of intakes of recruits and the nature of the job and its 
training, remain constant (factors which are apt to vary rapidly), would it be justifiable to weight 
the battery of selection tests strictly in accordance with the obtained β coefficients. Often, then, a 
much rougher procedure, based on Spearman's Correlation of Sums formulae,† and on the 
habitual inter-correlations of the tests in other similar groups, will provide an equally good guide 
as to how the tests are likely to work in combination. 
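
The rougher procedure can be illustrated with the correlation-of-sums formula for an equally weighted battery of standardized tests: the sum of the validities divided by the square root of the sum of all entries in the table of inter-correlations (unities in the diagonal). The two-test comparison with optimal weighting is added only for illustration, and the figures are invented.

```python
import math


def correlation_with_unweighted_sum(inter, validities):
    """Correlation of the criterion with the plain sum of standardized tests.

    inter: test inter-correlations with 1.0 in the diagonal.
    validities: each test's correlation with the criterion.
    """
    total = sum(sum(row) for row in inter)
    return sum(validities) / math.sqrt(total)


def multiple_r_two_tests(r1, r2, r12):
    """Optimally weighted multiple r for two tests (standard closed form)."""
    return math.sqrt((r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12))


# Hypothetical pair of tests inter-correlating 0.30, with validities 0.40 and 0.45.
equal_weights = correlation_with_unweighted_sum([[1.0, 0.3], [0.3, 1.0]], [0.40, 0.45])
best_weights = multiple_r_two_tests(0.40, 0.45, 0.30)
```

In this case equal weighting gives about 0.527 against about 0.529 for the best possible weights, so very little is lost by ignoring the β coefficients.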

The next difficulty is that we have not, like American Service psychologists, machines for 
weighting test scores, and it would take our officers and N.C.O.s or Petty Officers in the reception 
units far too much time to calculate weighted scores for numerous different jobs. Instead the 
Army has adopted the method of minimum standards on the most relevant tests. To be acceptable 
as a clerk, a recruit should be above a certain grade on the Clerical, Arithmetical, and other tests, 
but need not score highly on mechanical tests; similarly with other jobs. Now, while this method 
does weight the tests, it does not duplicate multiple correlational findings. It is a substitute whose 

* “Validating Tests for Personnel Selection,” Brit. J. Psychol., 1943, Vol. 34, pp. 1-19. 

† Cf. Kelley, op. cit., Chap. IX. 




validity is definitely lower than that predicted by multiple r. Only in one, particularly important, 
job have we been able to use summed, weighted scores, and this procedure has broken down 
rather widely because many of the officers are utterly sceptical of negative weights. Nothing would 
persuade them that better predictions could be obtained by subtracting scores on certain tests from 
the summed scores on certain other tests. 

Note that minimum standards always imply a high regression coefficient at a certain point in the 
distribution. But I mentioned earlier that our regressions are frequently non-linear, so that a multiple 
correlation based on the whole range of scores may or may not be a true guide when it is a matter 
of cutting out the poorest 10 or 20 per cent. of recruits. For these and other reasons the Navy 
seldom resorts to minimum standards, and prefers to judge a recruit's capacities from his score on 
all the standard tests equally weighted, though paying more attention to one or two specialized tests 
when considering him for a specialized job. In both Services, it should be added, selection or 
rejection is never a matter of test scores alone ; a recruit’s interests, previous experience, and tem- 
peramental traits as judged in an interview or assessed from his work record, are all considered 
together before his suitability is decided. 

This brings us to corrections for homogeneity or heterogeneity. The object of Burt’s tech- 
nique is to deduce, from the correlations of tests with a criterion in a selected group, and from the 
test variances and covariances in the selected and unselected populations, what would be the 
relations with the criterion in the unselected population. Its value in vocational investigations is 
lessened, however, by the fact that selection is never based solely and explicitly on the battery of 
tests. In the Navy and Army various interviewers pay varying amounts of attention to the test 
scores, some being more influenced by the other considerations I have mentioned. Moreover, the 
variances in the selected group may be reduced not so much because selection was based on the 
tests as because other factors, whose selectivity we may not know, such as education and civilian 
occupation, have brought about indirect selection. An additional difficulty is that most tests do 
not show normal score distributions ; a positively skewed test, for example, may actually possess a 
higher variance in a superior, selected, group than in an unselected group. 
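
The general idea of such a correction can be shown with the much simpler univariate case, in which selection is assumed to have been made explicitly on a single test. This is the familiar formula for restriction of range, not Burt's multivariate technique, and it rests on the very assumptions (explicit selection on the test, linear regression) that the preceding paragraph calls into question.

```python
import math


def correct_for_restriction(r_selected, sd_unselected, sd_selected):
    """Estimate the validity in the unselected population from the validity
    observed in a group selected explicitly on the test itself.

    Univariate correction only (sketch assumption): k is the ratio of the
    test's standard deviation before selection to that after selection.
    """
    k = sd_unselected / sd_selected
    return r_selected * k / math.sqrt(1 - r_selected ** 2 + (r_selected * k) ** 2)


# Hypothetical case: validity 0.30 in a group whose spread has been halved.
corrected = correct_for_restriction(0.30, 10.0, 5.0)
```

A validity of 0.30 in a group whose standard deviation has been halved by selection corresponds to about 0.53 in the unselected population. When the selected group has the larger variance, as with a positively skewed test in a superior group, the same formula would lower the coefficient instead of raising it.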

It must be admitted then that we are very rarely able to make adequate allowance for selection, 
and that a reliable picture of the worth of the selection tests could only be obtained by following 
up unselected men. Table III gives a good example of the effects of selection and of modifications 
in the criterion which occur only too often in everyday practice. 

Table III 

Correlations between tests and job proficiency among the first 1,100 trainees 

  Test No.    First 300    Next 300    Next 500 
     0          0.623        0.399       0.255 
     1          0.465        0.568       0.294 
     2          0.527        0.352       0.329 
     3a         0.553        0.379       0.164 
     3b         0.638        0.426       0.369 
     4          0.377        0.282       0.276 

A new mechanics branch was 
started, and the first 300 men sent for training were almost unselected, as we knew little about the 
requirements. The six selection tests — their content does not concern us at the moment — gave 
remarkably high coefficients. With the next 300 our Selection Officers knew something about the 
job, and most of the coefficients fell markedly ; but the syllabus also altered, and this, together with 
sampling errors, probably accounts for the higher validity of Test 1. As the Training School got 
used to the better quality of well-selected trainees, the syllabus was further stiffened, and in the third 
group of 500 the correspondence of our tests with the criterion is decidedly poor. 

It is not possible, in this paper, to discuss the intricate matter of fixing minimum standards with 
a view to achieving minimum wastage. It is desirable, of course, to cut out as many potential 
failures as possible with the smallest possible sacrifice of good men. In spite of considerable 
investigation, there are still many unsolved difficulties when one is dealing with selected groups, 
skewed distributions and imperfectly linear regressions. 


Analysis of variance 

Doubtless it will be asked why our work appears to have been based almost solely on correlation
methods. Actually this is not the whole story, but I have omitted applications of χ², the
determination of test norms, of means and group differences, and the like because they did not raise 
any novel points of particular interest. Analysis of variance and covariance has been widely used, 
though to a lesser extent than might be thought desirable, for the following reasons. First, the 
vocational psychologist has little opportunity to impose good experimental design. Masses of 
data are submitted, often in most inconvenient forms, and he must make the best of them. For 
example, the success of Army officers in the field is influenced by many factors, including age, length 
of service, the Arm in which they are commissioned, and it may, or may not, correlate with the 
grades assigned to these officers when they were candidates passing through the selection procedure. 
But it was quite impossible to collect even small groups of officers of each age, each length of service, 
in each Arm, and therefore impossible to estimate the variance contributions of such factors in 
combination. In such situations, partial correlation often yields a more hopeful approach. Other 
minor reasons are the comparative rarity of assistants who can carry out even routine analyses competently,
the shortage of calculating machines which are vital here but are less essential for many
correlational calculations, and lastly, the very great difficulty which the layman or partially trained
selection officer has in grasping the meaning of an analysis of variance. Unfortunately, the sales
value of our work had to be considered frequently, and many people who think (mistakenly) that
they understand correlations, shy at any mention of variance and F-ratios.

Nevertheless there are numerous everyday applications where the data provide their own 
design — for example, the significance of variations between classes or instructors in training-school 
marks ; variations in the results of a test of vision when given in different testing centres ; variations 
in efficiency between men drawn from different broad occupational groups ; and so forth. A very 
common problem in multiple correlation was solved by Mr. Slater as follows. Recruits may be 
trained at two, three or more similar jobs, such as operating different Radar sets or firing different 
kinds of guns. A battery of selection tests is correlated with each type of proficiency, and it is 
found that the patterns of test validities are somewhat, though not exactly, similar. Does this mean 
that the tests should be differentially weighted, or accorded different minimum standards, for each
job, or will a single set of weightings for all jobs give predictions which are insignificantly inferior? 
The technique is to calculate the multiple correlation in each job group, and in all groups combined. 
Analysis of variance then readily shows whether separate regression equations for each group cover 
significantly more variance than a single equation for all groups. 
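
The comparison described in the preceding paragraph can be sketched as a variance-ratio test. What follows is only a minimal illustration under stated assumptions, not the computation actually carried out by Mr. Slater: it assumes ordinary least squares with an intercept in every equation, and the function names (solve, residual_ss, separate_vs_pooled_f) and the layout of the data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def residual_ss(rows, y):
    """Residual sum of squares of y regressed on the test scores in rows (intercept added)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * xi for b, xi in zip(beta, x))) ** 2
               for x, yi in zip(design, y))


def separate_vs_pooled_f(groups):
    """groups: one (rows, y) pair per job, rows holding each man's test scores.

    Returns (F, df1, df2) for the extra variance covered by fitting a separate
    regression equation to each job rather than a single equation to all jobs.
    """
    k = len(groups[0][0][0]) + 1           # coefficients per equation, intercept included
    g = len(groups)
    n = sum(len(y) for _, y in groups)
    rss_separate = sum(residual_ss(rows, y) for rows, y in groups)
    all_rows = [r for rows, _ in groups for r in rows]
    all_y = [v for _, y in groups for v in y]
    rss_single = residual_ss(all_rows, all_y)
    df1 = (g - 1) * k                      # extra coefficients used by the separate equations
    df2 = n - g * k                        # residual degrees of freedom
    f = ((rss_single - rss_separate) / df1) / (rss_separate / df2)
    return f, df1, df2
```

The resulting ratio would be referred to tables of F with df1 and df2 degrees of freedom. As written, a difference in average proficiency between jobs also counts against the single equation; if only the weightings are in question, the proficiency marks could first be expressed as deviations from each job's mean.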

Finally, we have occasionally been able to arrange our own investigations. Thus in one study 
of the use of films and filmstrips for training seamen, I was able to analyse the contributions of 
these visual aids, of good and weak instructors, and other minor factors, and to allow for variations 
in intelligence of the trainees by analyses of covariance. In the Army there have been studies of the 
effects of remedial physical training on physical status, intelligence and achievement, with due 
allowance for improvements due merely to practice at the tests ; and similarly the effects of special 
education courses for illiterate recruits have been analysed. The effects of menstruation among
A.T.S. on test performance formed a particularly intricate study. For these, and numerous other
developments of analysis of variance, Mr. Slater was statistically responsible, and accounts of them 
will, I hope, soon be published. 


Discussion on Dr. Vernon’s Paper 

Professor Burt: I have great pleasure in proposing a vote of thanks to Dr. Vernon. As a
member of the War Office Advisory Committee on Personnel Selection, I have been in touch with 
his work from its inception; and have always been astonished at the skill and ingenuity with 
which he has attacked the new and innumerable problems to which his work gave rise. To-day he 
has presented us with a review which has been, I am sure, as deeply interesting to statisticians as it
has been to psychologists. The questions he has raised are by no means matters of the past ; they 
are likely to be of much importance in the work that awaits us in the immediate future. As he has 
pointed out, the statistical methods taken over for work on personnel selection during the war had 
largely been elaborated during earlier investigations carried out in the field of child psychology ;
and it has been most encouraging to those of us who had some small hand in developing those
methods to learn how successful they have been in these newer tasks. But, in much the same way,
the further experience gained in these war-time applications will in turn be of special value in our 
new peace-time problems — in education, in industry, and in commerce. 

Those of us who attempt such studies in the laboratory suffer from one great disadvantage. 
The samples that we are able to test are either very small or else highly selected — usually both : 
they consist, as a rule, either of children at school, or of students at the University. Dr. Vernon, 
on the other hand, has been able to analyse data obtained from exceptionally large samples of 
unselected adults. It has therefore been particularly gratifying to hear how the broad factorial 
hypothesis, put forward thirty years ago, on the basis of a few experiments on children tested in 
schools, has been in the main confirmed by his more recent and extensive studies : the hypothesis 
was that mental abilities include both a very general factor, popularly termed “intelligence,” and
a number of more special abilities, technically termed “group-factors.” Spearman doubted the
existence, or at any rate the importance, of the group-factors ; Thurstone doubts that of the general
factor. Dr. Vernon finds both : “over and over again” he has (he tells us) encountered “a general
factor running through all the tests,” and, in addition, “two, three, or more group factors,” closely
corresponding to those originally discovered in the educational field. 

On the other hand, he deplores the fact that, unlike the laboratory psychologist, he has rarely 
been able to plan his experiments on new techniques. Here, therefore, the laboratory psychologist 
may still be of service. Indeed, any suggestions that Dr. Vernon can put to us for further work 
in this direction will be most welcome, particularly in regard to the development of less cumbersome 
statistical procedures. I note, for example, that in factor-analysis he has found that the newer
“centroid” method, developed by Thurstone in America, has proved more convenient in practice
than the older technique of “simple summation” developed in this country. It would, therefore,
be most helpful if he could tell us (perhaps on some later occasion) in what respects the former has
been found more speedy or effective, especially as his use of Thurstone's technique has not apparently
led to Thurstone's own conclusions, but rather confirmed the older views.

But to-day, I imagine, Dr. Vernon, like myself, would rather hear suggestions and questions
from others present at this meeting. He has, it seems to me, raised two groups of questions for the 
expert to answer : first of all, how far does the statistician consider these quicker methods, adopted 
to meet the exigencies of practical work, to be valid and trustworthy ; secondly, how far can we 
trust the more elaborate and refined statistical techniques, when, owing to the peculiarities of the 
data, the conditions which they presuppose are not strictly fulfilled? Perhaps to a large extent the 
answer is to be obtained, not so much by theoretical discussion as by a study of information gained 
in this way from practical experience in the practical field. Dr. Vernon himself has described many 
of these practical trials — trials carried out on a scale quite unheard of in previous psychological 
researches ; and he has described them with admirable lucidity. We shall all look forward to the 
more detailed accounts that he has promised as soon as war-time restrictions are relaxed. Meanwhile,
we are most deeply indebted to him for this preliminary review; and it gives me the greatest
pleasure to have the privilege of moving the vote of thanks. 


Dr. Fraser Roberts : I have very great pleasure in seconding the vote of thanks to Dr. Vernon
for his admirable paper, because it has been my privilege during the War to see something of the
work that he and his colleagues have been doing. And very fine work it has been, work of great
interest and value, and now that security restrictions are being lifted one hopes that a great deal of
it will be published in the Journals and become generally available. 

Dr. Vernon has made very clear the difficulties under which psychologists have worked. They 
were, after all, pitchforked into the urgent practical business of personnel selection after the War 
had started and while expansion was very rapid, so that again and again, as Dr. Vernon has explained 
to you, they had to sacrifice the possibility of leisurely long-term experiments in order to cope with 
urgent, practical, everyday problems. It is a thousand pities that all this work could not have been 
started long before. It is greatly to be hoped that the lesson has now been learned, and that people 
like Dr. Vernon can go on with their work under peace-time conditions and be able to plan their 
experiments as they would wish. I am sure that, admirable as have been the results obtained in 
war-time, the results of the future will surpass them. 

It is perhaps with the future in mind that I should like to say a word about his observations on 
multiple correlation techniques. In attempting to devise tests for selection, one can consider the 
problem at two stages. The first is to attempt to predict after, say, an hour’s testing what the
practical man will discover, say, six months later. This presupposes that one accepts the practical 
man’s examinations or judgments of ability as one’s criterion. The further stage is to review the 
whole procedure of examination and practical judgment as well ; that is, to attempt to improve the 
practical estimation of success. In the circumstances of the War it was seldom possible to reach 
stage two : that is clearly long-term planning, and one hopes it will now be undertaken, but as re- 
gards the first stage, need one be worried by disagreements between practical men? I doubt it. 
If some sailors are classified as best by some observers and worst by others, which sometimes 
happens, if there is no correlation between examiners’ judgments, then there is no problem — there 
is no difference between a good sailor and a bad one and there is nothing to predict. But if, as is
usual, there is some agreement, and there may be high agreement, then the technique of multiple
regression, or more broadly of discriminant function analysis can be used to select the best tests and 
allot to them the best weights. 

I cannot feel that Dr. Vernon’s finding that in one sample test “ A ” contributes heavily and 
test “ B ” very little, while in another sample this is reversed, is any criticism of the technique, which
has, in fact, indicated the correct answer. “ A ” and “ B ” are highly correlated ; it matters little 
which we use, and the use of both adds little to what is given by one alone. 

An empirical approach may lead to the discovery of some unsuspected contribution ; there have 
been instances of apparently very bizarre measurements which, though relatively inefficient by 
themselves, nevertheless contributed effectively to a battery, because these additional tests were
measuring something which the other tests were not. One is, of course, up against the old difficulty 
of finding psychological tests which are not going to measure the same thing over and over again. 

Again, the peculiar distribution sometimes encountered may yield to methods of transformation 
and, of course, it is always possible to use more than the simple linear components. Even brutal 
normalization may not be unjustifiable sometimes. Above all, I remember a remark by Professor
Fisher: “ When you doubt your metric, use rank order.” 

I hesitate to mention my own trifling experiments, but I had the opportunity on one occasion of 
getting instructors independently to place in rank order classes as big as fifty, simply asking them to 
place the subjects in the order in which they would choose them for service with themselves in a 
ship. Although the instructors had emphasized their different points of view, the correlations 
turned out to be 0.65 or somewhat more. Starting with such a criterion, one can then build up
the test battery. I do feel that any attempt to split up the practical men’s grading into a number of 
components is difficult because, of course, one has to select the man as a whole. 
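
The agreement between two instructors' rank orders, of the kind just described, can be measured by a rank correlation. The discussion does not say which coefficient was used; the sketch below assumes Spearman's formula for rankings without ties, and the function name spearman_rho is hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation between two rankings of the same men.

    Assumes each ranking is a permutation of 1..n with no tied places.
    """
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Two instructors placing the same five men in order of preference.
first_instructor = [1, 2, 3, 4, 5]
second_instructor = [2, 1, 3, 5, 4]
agreement = spearman_rho(first_instructor, second_instructor)
```

For the five men shown, the squared differences in rank sum to 4, so the coefficient is 1 - 24/120, that is 0.8 apart from floating-point rounding. With tied places the formula no longer applies exactly, and the product-moment correlation of the ranks would be the safer course.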

With those few observations I should like to say once again how very much I have enjoyed 
listening to Dr. Vernon’s Paper, and I am sure you all realize what an immense lot of fine work
there is behind it, and, as I said before, I hope that we shall see much of it published in due
course. 

The Chairman read a letter received from Mr. Alec Rodger, Senior Psychologist at the Ad- 
miralty, in which he expressed his regret at being unable to attend ; he would have greatly liked the 
opportunity of paying tribute to the work Dr. Vernon had done for them all at the Admiralty. 
He felt sure that no one else could have played the same technical advisory role with such quick 
understanding and impartiality. 

Mr. Patrick Slater said he was glad to be able to corroborate or amplify some of Dr. Vernon’s
points. The paper gave a very just account of the statistical problems encountered in selecting 
army personnel and of the methods adopted for treating them. He, Mr. Slater, could not speak of 
problems concerning navy personnel. 

The most striking difference between the conditions under which their investigations were 
conducted and those which obtain in peace-time was one that he would describe by saying that 
under the conditions in which they worked their reasoning could be conducted in an enclosed 
universe of discourse. The strongest impression he retained from his army work was that of the 
elegance and simplicity which could be introduced into psychological argument when the universe
of discourse in which it was conducted could be enclosed. 

An instance would explain what he meant. Suppose it was necessary to select men who could 
be expected to succeed as motor mechanics. After observing what motor mechanics do, and 
excogitating upon it, they might advance a theory that superiority of mechanical aptitude was one 
of the characteristics which differentiated successful from unsuccessful mechanics. This theory 
might lead to endless controversy, over such questions as whether mechanical aptitude was a unitary 
trait, what were the best means of assessing it if it existed, and so on. Controversies of this kind 
were demonstrably endless when they concerned traits underlying human behaviour, i.e., psychologically
postulated traits treated by common consent as outside the range of direct observation.

These controversial questions need not arise in an enclosed universe of discourse. Taking 
the two propositions : — 

firstly, that degree of success as a motor mechanic depended on amount of mechanical 
aptitude, and 

secondly, that amount of mechanical aptitude was measured by score on a certain test, 

without attempting to establish the truth of either, the conclusion was that an individual’s probability 
of success could be inferred from his test score. To verify this conclusion experimentally, they 
followed up a group of men whose test scores were known, and obtained assessments of their 
success. The results could be used to prepare scales showing the probability of success of individuals 
with different test scores. 
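
A scale of this kind can be illustrated by tabulating, for each band of test scores, the proportion of followed-up men who were assessed as successful. The sketch below is an assumption about how such a scale might be drawn up, not a record of the procedure used; the function name expectancy_table, the band width and the figures are hypothetical.

```python
def expectancy_table(scores, succeeded, band_width):
    """Proportion of successes within each band of test scores.

    scores:     test score of each man followed up
    succeeded:  True or False for each man, from the follow-up assessment
    band_width: width of each score band; a band is labelled by its lower limit
    """
    counts = {}
    for score, ok in zip(scores, succeeded):
        band = (score // band_width) * band_width
        men, successes = counts.get(band, (0, 0))
        counts[band] = (men + 1, successes + (1 if ok else 0))
    return {band: successes / men
            for band, (men, successes) in sorted(counts.items())}


# Eight hypothetical men, grouped in bands ten points wide.
scores = [12, 18, 25, 27, 33, 38, 41, 47]
succeeded = [False, False, False, True, True, False, True, True]
scale = expectancy_table(scores, succeeded, 10)
# scale == {10: 0.0, 20: 0.5, 30: 0.5, 40: 1.0}
```

In practice the bands would need to hold enough men for the proportions to be stable; with small numbers, neighbouring bands might be combined or the proportions smoothed.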

The experimental condition which enclosed the universe of discourse, was that the methods of 
observation used on the follow-up were those used in the selection procedure. This condition 
could be satisfied in a large unit like the army, but not in units as small as psychological clinics 
usually were. It involved standardizing methods of observation and maintaining with adequate
authority a central organization for research. He would generalize by saying that the advantages
of reasoning in an enclosed universe of discourse should be taken whenever possible.

If the investigators habitually conducted their arguments in terms which they had designed and 
standardized for their own convenience, they found they could change the perspective in which they
viewed many psychological problems. Precise observations about individuals were required in
order to estimate with the smallest possible amount of error what their expected behaviour is under
precisely defined conditions. The statement that a man had a score of 48 on S.P. Test 1 meant
more than the statement that he had an I.Q. of 120; for the former specified the test which had been
used and the latter did not. The statement that he was a man of superior intelligence meant even 
less, for it did not imply that any controlled method of observation had been used. “ What is 
the nature of intelligence? — or of will-power, courage or any other hidden mental trait? ” — often 
described as the fundamental problems of psychology — are questions which there had been no 
need to consider. 

The use of standardized methods of observation led also to a change in methods of statistical
analysis. An investigator who designed a psychological test for a private research could modify 
it to his liking, add new items, discard old ones, and make it longer, shorter, easier or harder. The 
variance observed in the scores of a group of individuals might mean little to him because the scale 
on which he measured it was arbitrary. 

But once the scale was fixed, measurements in terms of test scores could be treated like measure- 
ments in terms of inches or ounces or seconds, gaining the advantages obtainable by maintaining a 
conventional notation. In fact, the investigators accustomed themselves to considering the relative 
homogeneity or heterogeneity of different groups of men, to using regression coefficients when once 
they would have used correlations, and to making much more use of analysis of variance. On the 
other hand, factor analysis, although a favourite technique of psychologists, was one for which 
they found relatively few applications. 

He took this opportunity of calling attention to an approach to problems of selection which 
he believed largely original. Using standardized methods of observation, records could be pre- 
pared of a representative sample of the individuals from whom some were to be chosen, and of a 
representative sample of individuals engaged in an occupation to which some were to be directed. 
Hence the discriminant function could be found, and with it the multiple regression equation which 
described the appropriate method of selection in terms suitable for application. If several occupa- 
tions were to be considered simultaneously, the same methods of observation could be used for all, 
and the samples drawn from different occupations could be treated both in combination and 
separately, and thus classified into the groups for which similar regression equations were applicable. 
This, in his opinion, was the general form of treatment most suitable for solving problems of 
selection. It was based on the assumption that the average man engaged in any occupation was 
neither too good nor too bad in his work. This assumption appeared to be reasonable. 
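
For two measurements per man, the discriminant function between a sample of candidates and a sample of men already in the occupation can be sketched as follows. This is a minimal illustration of Fisher's linear discriminant with a pooled within-group covariance matrix, offered on the assumption that this is the function intended; the function name fisher_discriminant and the data are hypothetical.

```python
def fisher_discriminant(candidates, occupation):
    """Weights w = S^-1 (mean of occupation - mean of candidates).

    Each sample is a list of (test_1, test_2) pairs; S is the pooled
    within-group covariance matrix of the two tests.
    """
    def summarize(sample):
        n = len(sample)
        m1 = sum(p[0] for p in sample) / n
        m2 = sum(p[1] for p in sample) / n
        s11 = sum((p[0] - m1) ** 2 for p in sample)
        s22 = sum((p[1] - m2) ** 2 for p in sample)
        s12 = sum((p[0] - m1) * (p[1] - m2) for p in sample)
        return (m1, m2), (s11, s12, s22)

    mean_c, scatter_c = summarize(candidates)
    mean_o, scatter_o = summarize(occupation)
    dof = len(candidates) + len(occupation) - 2
    s11, s12, s22 = [(a + b) / dof for a, b in zip(scatter_c, scatter_o)]
    det = s11 * s22 - s12 * s12            # determinant of the 2 x 2 pooled matrix
    d1 = mean_o[0] - mean_c[0]
    d2 = mean_o[1] - mean_c[1]
    # Multiply the mean difference by the inverse of the pooled matrix.
    return ((s22 * d1 - s12 * d2) / det, (s11 * d2 - s12 * d1) / det)


# Hypothetical samples that differ on test 1 only.
candidates = [(0, 0), (2, 0), (0, 2), (2, 2)]
occupation = [(4, 0), (6, 0), (4, 2), (6, 2)]
weights = fisher_discriminant(candidates, occupation)
```

In this contrived case the weight falls wholly on test 1, the only test on which the samples differ. For two groups, regressing a dummy variate marking membership of the occupation on the test scores yields weights proportional to these, which is one way of passing from the discriminant function to a regression equation.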

If the data were treated in this manner there was no need to collect proficiency ratings — ratings 
which were often subjective and had little relevance to the specific requirements of particular occu- 
pations. Furthermore, the regression equations we obtained enabled us to take account of the 
alternative possibilities that a man might be too bad for an occupation or too good. Regression 
equations obtained by using proficiency ratings as criteria allowed account to be taken of the first 
alternative only. Furthermore, they provided no convenient method for comparing the require- 
ments of different occupations. 

How to use observations of a crude kind, such as classifications of men into those who gave 
right or wrong answers to a particular question, or into a few roughly defined grades of ability, were 
questions which constantly arose, as Dr. Vernon had already remarked. The relationship of an 
attribute with a dichotomous classification to other attributes or variables could be considered in 
various ways. To express its relation to a continuous variable in terms of a regression coefficient 
such as biserial r seemed to him artificial ; the variance ratio or the correlation ratio expressed it in 
simpler terms. If the dichotomized attribute were treated as dependent on more than one inde- 
pendent variable, the discriminant function could be found (allowance should be made for the 
relative frequency of the two alternatives). If the dichotomized attribute were treated as independent,
and its influence in combination with other independent variables on a dependent one were
considered, its independent effect could be isolated by analysing the variances and covariances of
the remaining variables within and between groups. If large numbers of dichotomized variables
were considered simultaneously, four-point correlations could be used. In short, the occasions on
which it was strictly necessary to use a biserial correlation coefficient arose, in his opinion, very
infrequently. 

When a biserial r was calculated (using Pearson’s formula) from data from which the product 
moment correlation had already been found, the biserial r was often a poor estimate of the product 
moment r. Slight irregularities in the form of distribution of the continuous variable might have 
unfortunate effects. In his experience, tetrachoric r (calculated from Pearson’s tables) generally 
gave a closer estimate, particularly with a good-sized sample, say 400 cases or over, or if Yates's
correction for continuity was applied to the frequencies. This was his defence for having used
the authority conferred on him by the Director to ban the use of biserial r in the Directorate of
Selection of Personnel. 
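
The sensitivity of biserial r to the form of the distribution can be seen in a small worked sketch. It assumes the usual statement of Pearson's formula, r = (M1 - M0)/s x pq/y, where y is the ordinate of the normal curve at the point of dichotomy and s is the standard deviation of all the scores; the function names and the data are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def _split(values, flags):
    """Separate the continuous scores by the dichotomy; return both parts and p."""
    upper = [v for v, f in zip(values, flags) if f]
    lower = [v for v, f in zip(values, flags) if not f]
    return upper, lower, len(upper) / len(values)


def biserial_r(values, flags):
    """Biserial r, assuming a normal variable underlies the dichotomy.

    Both classes must be non-empty, otherwise the normal cut point is undefined.
    """
    upper, lower, p = _split(values, flags)
    q = 1 - p
    cut = NormalDist().inv_cdf(q)          # point leaving proportion p above it
    y = NormalDist().pdf(cut)              # ordinate of the normal curve at that point
    return (mean(upper) - mean(lower)) / pstdev(values) * p * q / y


def point_biserial_r(values, flags):
    """Product-moment correlation between the scores and the 0/1 classification."""
    upper, lower, p = _split(values, flags)
    return (mean(upper) - mean(lower)) / pstdev(values) * (p * (1 - p)) ** 0.5


# Eight evenly spaced scores, far from normal in form, cut at the median.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]
r_bis = biserial_r(scores, passed)
r_pb = point_biserial_r(scores, passed)
```

With these flat, evenly spaced scores the product-moment value is about 0.87, while the biserial coefficient comes out above unity (about 1.09), an impossible value for a correlation. The two always differ by the factor sqrt(pq)/y, here about 1.25, so the biserial figure depends wholly on the assumed normal form being appropriate.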




His experience confirmed Dr. Vernon’s, that when data were graded on a rough scale, in inter- 
vals that could not be assumed to be equal, the most consistent results were obtained by using cen- 
troids. Mr. Slater had no experience in the use of the correction which Dr. Vernon mentioned, 
as he had used the method only when no finer groupings of the data could be obtained, i.e., when
there had been no need to speculate on what the variance would have been if the variable had been 
measured on some other scale. 

In conclusion, he desired to confirm that every effort was made by the War Office to provide 
the facilities needed by D.S.P.’s statistical section, that no requests presenting a reasonable case for 
additional staff or equipment were ever summarily rejected, and that the Adjutant-General’s 
Statistical Department invariably extended to the section every assistance possible. 

Dr. Irwin wished to congratulate Dr. Vernon on the energy and ability he had shown in apply- 
ing statistical methods under the trying conditions of war-time. 

To have worked out 8,000 correlation coefficients, no matter how, and 1,000 factor analyses was 
in itself an achievement. But he felt that Dr. Vernon had left out one important factor which 
probably contributed more than anything to the success of his work ; that was his own flair for 
understanding the meaning of his data. Given this last, he would probably have found out much 
about the right way to select army and navy personnel if he had confined himself to simple tabula- 
tions and the working out of averages ; without it his correlation coefficients and factor analyses 
would have been of little avail. 

In reading an account of miscellaneous methods of determining correlation, one felt that 
reference should be made to the original work of Karl Pearson ; contingency, tetrachoric r, biserial 
r, and he thought also the method of centroids were all discovered by Karl Pearson originally as a sort 
of by-product of his fundamental work. Also he was the pioneer in investigating the influence of 
selection on variation and correlation. 

In the thirties he, the speaker, maintained, and had never really receded from the position, 
that coefficients of contingency, tetrachoric r, biserial r and the like should be used as sparingly as 
possible, and never as an end in themselves; because, except for very large samples, one knew 
practically nothing about their sampling distributions and could not apply tests of significance. 
He thought this was still the position. And he would like to ask this question : if all sorts of short- 
cut methods were used to work out correlation coefficients, if distributions were not normal, if 
samples were relatively small and the results were applied to factor analysis, did anyone know what 
significance to attach to the results ? 

This was not really a criticism of Dr. Vernon because, if he, Dr. Irwin, understood his paper,
he did not use these correlation coefficients as an end in themselves ; he had to use existing methods 
with all their imperfections, and when he had done his factor analyses or multiple regressions, 
significance tests not being available, he had to rely on common-sense only. 

For instance, the grouping of tests into two main types — verbal arithmetical and educational 
versus practical mechanical and spatial — would probably have emerged whatever technique was 
employed. 

And, after all, the comparison of the methods of selection he devised with the available objective 
criteria must have provided a check. But where there was a failure of a series of tests to correlate 
with an objective criterion, it must sometimes have been difficult to know whether the tests or the 
criterion were at fault. When he, Dr. Irwin, read that the practical marks in Morse ranged from
75 per cent, to 100 per cent., he was reminded of a similar difficulty experienced by Mr. Farmer 
many years ago when he was trying to find tests for selecting naval gunners and the only objective 
criterion was a test in which everyone scored about 80 or 90 per cent. 

Finally he would like to end as he began by congratulating Dr. Vernon on the success of his 
work.

Sergeant Kinsman said that, speaking as one of the semi-skilled assistants to whom Dr. Vernon 
had referred, he agreed that there had certainly been snags in the work they had carried out, parti- 
cularly in dealing with high correlations. He did not presume to say which was right or which 
wrong, but there was, for instance, the question of centroids. They had found in a number of 
cases the centroid yielding a correlation of over 1.0, so in those particular instances at least the
centroid method had to be cut out. They had then gone on to the graphical method, which was 
extremely handy to use and very fast, but there were two snags to it : one was with correlations of 
over 0.95, when it was impossible to interpolate at all, and the other was when there was a cell
with zero. 

When it was impossible to use the graphs one had turned to Burt’s trigonometrical method. 
This, too, had been found to yield correlations of over 1.0. Finally, coming down to Pearson’s
tables, these could not be used if there was a zero in the cell in the bottom right-hand corner.

Mr. Kendall thought he knew less about psychology than anybody in the room, but there 
were two types of problem which were thrown up by this kind of work which the theoretical 
statistician ought to tackle. He agreed with Dr. Irwin as to the general unreliability of the short-cut
methods of correlation, but the plain fact was that, however unreliable they were, the psychologist
had to use them. In the past psychologists had been dealing with rather small samples ; in the
future, if this work was carried out with plenty of staff, it ought to be possible to discover something 
about the sampling variation of these coefficients by experimental means. 

A problem which was thrown up peculiarly by psychological work, and was also occurring at 
the present time in various other fields, arose when one suddenly added another variate and found
that all the coefficients in a regression equation altered considerably. If one had a scatter of points 
in n dimensions and fitted a number of planes to them by any of the approved techniques, it was 
possible to account for the major part of the variation in a certain lower number of dimensions. 
But some of the points would lie slightly outside those dimensions, and the question was how far 
this could be regarded as a sampling effect— was there, in fact, a sampling variability in a dimension 
number ? or, in psychological terms, how many factors must one take before the data are “ satis- 
factorily ” described? He hoped that this peculiar and interesting problem would be taken up. 

Mr. A. H. J. Baines said he understood that in factor analysis Dr. Vernon employed simple
summation followed by a group factor method. In Dr. Vernon’s hand the latter technique was 
without doubt a most powerful weapon, but in his, the speaker’s, its dependence on personal 
judgment might be dangerous. He had found that a group factor analysis might give rise to a 
communality which was impossibly high in view of the known reliability of the test concerned. 
If the underlying assumptions were varied, the impossibility might merely shift from one test of the 
battery to another. In such a case he would be inclined to make a linear transformation of the 
simple summation analysis, to bring it as nearly as possible into line with the group factor analysis. 
He was aware that such a transformation would not in general be strictly a rotation, but it would 
approximately preserve the communalities of the simple summation analysis and avoid the paradox
he had just mentioned. The transformed factor loadings corresponding to zeros of the group 
analysis might indicate the presence of small group factors not allowed for in that analysis. He 
hoped that this treatment was in accordance with the views of Dr. Vernon and Professor Burt. 

Dr. Vernon expressed his gratitude to the Society for the honour it had done him in inviting 
him to address the meeting, and also to Professor Burt and Dr. Fraser Roberts and others for the 
kind remarks they had made. 

Replying to some of the points raised, Dr. Vernon said that Professor Burt had contrasted the 
work of the laboratory psychologist, who can carry out carefully designed experiments under
fully controlled conditions, and that of the vocational psychologist in the Services who has to work 
at high speed and produce results of immediate practical utility. There are grave dangers in the 
latter approach. As one member of Personnel Selection Staff had expressed it, “ we have become 
psychologically and statistically illiterate.” If this work continues in peace-time, it is most desirable 
that Service psychologists should have more time for fundamental research and should maintain 
closer contacts with laboratory psychologists. 

Referring to Dr. Fraser Roberts’ remarks on multiple correlation, Dr. Vernon entirely agreed
that this technique (or the discriminant function technique) is of value for choosing the best
selection tests out of a large battery. But for the shortage of staff and lack of time, Service
psychologists would undoubtedly have devised many more tests, and studied their predictive value 
in particular jobs. The difficulties that he had described in multiple correlation methods arose 
when a single battery of some five to ten standard tests was given to trainees in numerous jobs, 
and it was desired to determine the appropriate weightings of the component tests in the battery 
for the various jobs. While he would dispute Mr. Slater’s assumption that the average man
engaged in any occupation is neither too good nor too bad at his work, he had often employed 
criteria of this type in studying the relative value of tests in a battery, and had generally found the 
results to coincide closely with those obtained when other criteria, such as proficiency ratings, were 
used. 

Sergeant Kinsman’s difficulties with tetrachoric and centroid coefficients are likely to arise only 
when the populations are very small, and the distributions on the correlated variables so irregular 
that no correlational technique can satisfactorily express the agreement between them. Mr. 
Baines’s method of linear transformation of simple summation factorial analyses clearly possesses 
statistical advantages over his (Dr. Vernon’s) group factor method. But as it is guided by the 
previous group factor analysis, it too must depend rather largely on the analyst’s personal judgment. 

Dr. Irwin and Mr. Kendall commented on our ignorance of the sampling distributions of 
“short-cut” correlations. He would point out, first, that in many investigations more orthodox 
correlational, chi-squared, or analysis of variance techniques were applied, and the ordinary tests 
of significance were therefore applicable. Secondly, he would recall the essentially pragmatic 
approach of the psychologist in the Services. It was only rarely that fine discriminations were
needed; much more often a correlation, or other statistic (which was based on populations
numbering several hundreds) was so obviously significant that conclusions derived from it could be
applied forthwith, or else it was too small to possess any practical value. 

Finally, Dr. Vernon said that there were good prospects of Service psychologists and 
statisticians being allowed to publish their work, and of similar work being continued, in a more 
leisurely and more thorough manner, during peace-time. 





The Application of some Commercial Calculating Machines to Certain Statistical 

Calculations 

By H. O. Hartley 

[Read before the Research Section of the Royal Statistical Society, June 5th, 1946,
Dr. J. Wishart in the Chair.]

Introduction 

It seems to be a popular belief that a statistician must, of necessity, be good at figures. This
may be true of some statisticians, but is certainly not the rule. Many good statisticians loathe 
figures and are weak at mental arithmetic; others are good at it; others, again, are indifferent. 
Let us hear what L. McMullen says in his appreciation (13) of the great “Student”:

“ It might be supposed from the amount he did in the time, that he [Student] was unusually 
good at arithmetic and the arrangement of work; this, however, was not the case, for his 
arithmetic frequently contained minor errors. In one of his obituary notices a tendency to 
do work on the backs of envelopes in trains was mentioned, but this tendency was not 
confined to trains ; even in his office much work was done on random scraps of paper. He 
also had a great dislike of the tabulation of results and preferred to do everything from first 
principles whenever possible.” 

Later on McMullen relates an amusing incident : 

“ When he handed over to me a routine calculation which he had done for many years,
I was astonished to find that he had written out every week an almost unvarying form of 
words with different figures. To my question, ‘ Whyever don’t you get a printed form?’ 
he did not reply, ‘ Doing it from first principles every time preserves mental flexibility.’ He 
would have considered such a remark unbearably pompous. He said, ‘ Because I’m too 
lazy,’ to which I replied, ‘ Well, I’m too lazy not to.’ ” 

There are here two diametrically opposed opinions, both justified at the time when they were 
formed. When a new statistical calculation is carried out for the first time it would seem a waste 
of time to think about designing a form or to get one printed, only to find perhaps that after 
two or three applications some essential features in the data have to be altered and the form scrapped. 
Yet when one finds that the work has developed into a standard routine, a printed form will 
obviously save a great deal of labour. 

A printed form is perhaps the simplest computational aid, and the little story I have told 
about one illustrates an important point affecting all such aids and procedures. They are all 
intimately linked with details of organization ; in particular their efficiency and economy depend 
entirely on the size of the job and the frequency with which a particular calculation has to be 
performed. The research statistician must bear this constantly in mind. His task is, in this 
respect, more difficult than that of his colleague who is concerned with large-scale routine work 
of a set and unvaried character. 

The research statistician may have to carry out a new analysis not knowing whether it will 
ever be repeated. It may never occur again; it may be the first of a large number of similar 
analyses. He cannot afford to wait until such a decision is reached — indeed, the decision about 
repetition often depends on the result of the “pilot.” The varied and changeable character of 
research and development work therefore necessitates a flexible and adaptable organization of 
computing methods. It is clear, therefore, that it is impossible to advocate a “ best computing 
method ” for any particular statistical calculation. If somebody were to ask, “ What is the best 
way of calculating a correlation coefficient?”, I am afraid the answer would be the Joadish, “ It 
all depends what you mean by correlations ! ” Is it a single correlation coefficient calculated 
from (say) 30 pairs of two-figure observations? If so, is this calculation to be repeated daily, 
weekly, monthly or annually? Are the 30 observations multivariate sets, having each 5 variables, 
10, 20 or even 100 varieties, and are all intercorrelations required ? Is it only a selection of inter- 
correlations that is required? In which form are the data recorded? Are they scattered in 
note-books, are they tabulated on manuscript sheets, or have they been punched on Hollerith 




cards? All this affects the choice of method, the equipment to be used and, indeed, the time to be 
spent on planning the computational method to be employed. It is obviously foolish to spend 
a day planning a more efficient way of doing a particular calculation only to find that the work
could have been completed by an existing inefficient method in a fraction of that time. 

We must therefore at the outset disappoint the reader who expects a clear-cut answer as to the 
best computational aid and best method for any statistical calculation in all circumstances, and 
will state immediately the very limited programme of this paper. Only commercial calculating 
machines are dealt with (leaving out, therefore, machines or gadgets specially built in small numbers 
for statistical calculations). The use of these machines is illustrated in Part I by selecting one 
important type of statistical calculation: multi-variable analysis. This problem has been selected
because here heavy computing occurs fairly frequently. The analysis of variance is included as 
a special case (single variable). 

Adding and listing machines, calculating machines and Hollerith equipment are described from 
the point of view of their uses in multi-variable analysis. A description of some of their functions 
only is given, without embarking on technical details. The methods of Part I are those believed
to be most suited to these machines in the light of several years of computing experience at Scientific
Computing Service, Ltd. They are, as it were, a sample drawn from the “bag of tricks”
of this organization, and grateful acknowledgment is made to Dr. L. J. Comrie and my former
colleague G. B. Hey for having contributed in discussion many of the tricks in the bag. The
sample drawn is by no means random, but is biased by my personal choice and preference as to
which Jack-in-the-box should come out of the box, whether Hollerith's “Punch,” Brunsviga’s
“Judy” or others! In particular, I must take the blame for having somewhat distorted an unpublished
note that Mr. Hey prepared on the application of Hollerith equipment to multi-variable
work on which section (iv) (b) is based.

Little reference is made to publications by American authors on Hollerith methods, partly 
because British tabulators differ considerably from American, and partly because in the United 
States Hollerith equipment appears to be more readily available at little or no cost for scientific 
calculations (10), and is therefore often used in circumstances in which its use in this country 
cannot be afforded. A good survey of American applications is given in (10) and (1). 

Because of the widely varying circumstances as to availability of equipment, no general comparisons
of economy of computing aids are made; such statements as are made are based on
the assumption that Hollerith equipment would have to be specially hired for the purpose of 
doing the particular calculation. 

In Part II we give a miscellaneous special selection of important statistical calculations.
Reference is made to the functions of calculating machines described in Part I. The mechanical
methods described are, to the best of my knowledge, new. Only elementary mathematics is used 
throughout. 

Part I. Multi-variable work mechanized 

(i) The problem 

The data consist of a number of observations, each comprising several (say k) variables. For 
example, the observations may consist of a sample of machine parts made from steel, and the 
variables observed for each individual part may relate to a series of hardness tests and/or deter- 
minations of the chemical composition of the steel from which each part was made; or the 
observations may consist of a group of school-children, and the variables represent different 
ability and intelligence tests carried out with each child ; or, again, the observations may corre- 
spond to a series of years, and the variables may be given by the yield per acre of different crops 
and/or meteorological data appertaining to each year. 

Such data may be set out formally as shown in Schedule (i) below : 

                                  Variable number
Observation number      1        2        3        4      ...      k

        1            a_{11}   a_{21}   a_{31}   a_{41}    ...    a_{k1}
        2            a_{12}   a_{22}   a_{32}   a_{42}    ...    a_{k2}
       ...
        N            a_{1N}   a_{2N}   a_{3N}   a_{4N}    ...    a_{kN}

                                   Schedule (i)




The computational task is usually to calculate all the possible $\tfrac{1}{2}k(k-1)$ intercorrelations
between every pair of variables. The bulk of this work consists in forming all possible sums of 
products obtained by pairing two variables. This involves forming line by line the products of 
these two variables and finally adding the products thus formed for all lines. Mathematically 
speaking the task is to calculate 

$S_{ij} = a_{i1}a_{j1} + a_{i2}a_{j2} + \dots + a_{iN}a_{jN}$

for all possible pairs of variable indices $i, j$. These quantities should normally include the k sums
of squares

$S_{ii} = (a_{i1})^2 + (a_{i2})^2 + \dots + (a_{iN})^2$

This calculation is the fundamental and also the heaviest part of all multi-variable work. 
Indeed, in certain least squares problems the above raw sums of products Sij represent the sym- 
metrical matrix of the normal equations to be solved, and no modifications are needed. For 
multiple regression work they would first have to be converted into sums of squares and products 
of deviations from the mean by 

$N s_{ij} = N S_{ij} - T_i T_j$, where $T_i = a_{i1} + a_{i2} + \dots + a_{iN}$.

Finally, for the calculation of intercorrelations we must form

$r_{ij} = (N s_{ij})(N s_{ii} \, N s_{jj})^{-1/2}$.

The work involved in these final stages is comparatively light, and will not be considered 
further. If the number of variables is large, or even moderate, the work of solving the simultaneous
equations with matrix ($S_{ij}$) or matrix ($s_{ij}$) is, of course, not negligible, and is discussed
in Part II (iv).
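The three stages just described (raw sums of products, conversion to deviations from the mean, and intercorrelations) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the tiny data set are hypothetical and do not come from the paper, and each variable is held as one list of N observations.

```python
from math import sqrt

def raw_sums_of_products(columns):
    """S[i][j] = a_i1*a_j1 + a_i2*a_j2 + ... + a_iN*a_jN."""
    k = len(columns)
    return [[sum(x * y for x, y in zip(columns[i], columns[j]))
             for j in range(k)] for i in range(k)]

def correlations(columns):
    """Return (S, T, Ns, r) for k variables, each a list of N observations."""
    n = len(columns[0])
    k = len(columns)
    S = raw_sums_of_products(columns)
    T = [sum(col) for col in columns]            # column totals T_i
    # N*s_ij = N*S_ij - T_i*T_j; integers stay exact, as on a desk machine
    Ns = [[n * S[i][j] - T[i] * T[j] for j in range(k)] for i in range(k)]
    r = [[Ns[i][j] / sqrt(Ns[i][i] * Ns[j][j]) for j in range(k)]
         for i in range(k)]
    return S, T, Ns, r

# Hypothetical data: k = 2 variables, N = 4 observations.
columns = [[1, 2, 3, 4], [2, 4, 6, 9]]
S, T, Ns, r = correlations(columns)
print(S)    # [[30, 64], [64, 137]]
print(T)    # [10, 21]
print(Ns)   # [[20, 46], [46, 107]]
```

Nearly all the arithmetic sits in `raw_sums_of_products`; the later stages touch only the k totals and the k × k matrix, which is why the text treats them as comparatively light.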

(ii) Digital accuracy 

Without much loss of generality it may be assumed that the values $a_{ir}$ are one-, two- or three-
digit quantities. Where the original data are given to more significant figures we may subtract
from all observed quantities in the same column (i.e., same variable) a suitably chosen working
mean. If the resulting deviations still exceed three figures it will usually be found that the original 
data were recorded to a spurious accuracy. For the subsequent calculations on machines it is 
advantageous to choose working means smaller than the lowest value of each variable, so that 
all deviations are positive ; this is better than making them as small as possible. It is then more 
appropriate to speak of a working origin or, more colloquially, of a dropped constant. With 
these changes (including changes of scale) it may be assumed without loss of generality that all
$a_{ir}$ entering into the calculation of the $S_{ij}$ lie between 0 and 999. The return to the original
units is either unnecessary (correlations) or trivial (means, standard deviations or regression 
coefficients). 

An example of the above treatment is given in Tables I and II. Table I gives length and
thickness (in inches) for 5 logs, while Table II gives their deviations (in 10 thous.) from a working 
origin of 45" and 68" respectively. 


              Table I                                   Table II

  Length, ins.   Thickness, ins.      Length (10 thous.)   Thickness (10 thous.)

      46.78          68.77                   178                    77
      48.73          68.99                   373                    99
      45.31          71.34                    31                   334
      48.43          72.33                   343                   433
      49.77          69.09                   477                   109

     239.02         350.52 *                1402                  1052 *
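The passage from Table I to Table II can be checked mechanically. The sketch below is an illustration only: it holds the Table I readings as integers in units of 0.01 in. (an assumption made here to keep the arithmetic exact), and the helper names are hypothetical. It also confirms that the quantity $N S_{ij} - T_i T_j$ is unaffected by the choice of working origin, which is why no return to the original units is needed for correlations.

```python
# Table I readings in units of 0.01 in. (46.78 in. -> 4678, and so on).
length = [4678, 4873, 4531, 4843, 4977]
thickness = [6877, 6899, 7134, 7233, 6909]

def deviations(values, origin):
    """Positive deviations from a working origin lying below every value."""
    devs = [v - origin for v in values]
    assert min(devs) >= 0
    # Total check: sum of deviations = sum of originals - N * origin.
    assert sum(devs) == sum(values) - len(values) * origin
    return devs

def n_times_s(x, y):
    """N*S_xy - T_x*T_y, the product sum corrected to the mean (times N)."""
    return len(x) * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)

length_dev = deviations(length, 4500)         # working origin 45 in.
thickness_dev = deviations(thickness, 6800)   # working origin 68 in.

print(length_dev, sum(length_dev))         # [178, 373, 31, 343, 477] 1402
print(thickness_dev, sum(thickness_dev))   # [77, 99, 334, 433, 109] 1052

# The working origin drops out of the corrected sums of products.
assert n_times_s(length, thickness) == n_times_s(length_dev, thickness_dev)
```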


With biological data the choice is usually between a convenient working origin with three- 
figure deviations and a less convenient working origin with two-figure deviations. With industrial 
data, where the variations are frequently less than 5 per mille, it often suffices to ignore one or 
two constant leading figures. This has the advantage of rendering the listing of deviations 
unnecessary. 

The data supplied to the statistician will hardly ever be in this ideal form of positive deviations
from a working origin. Indeed, in many cases they will not even be in the form of Schedule (i). 
Where many variables are to be correlated these will often be recorded by different observers ; 




they may be scattered in various note-books or on various forms, and will have to be assembled. 
Wherever possible the work of collating the data and of converting them into deviations from 
working origins should be avoided by providing the observer, at the outset, with a suitable record- 
ing form and instructions for its use. There are, naturally, often practical limitations to this 
procedure. As a rule, therefore, this task becomes part of the computation. 

(iii) The use of an adding and listing machine and of a calculating machine in multi-variable work

The teaming of the above two machines will be found very helpful — in fact, they represent an 
attractive combination for general work in a statistical laboratory. 

(a) Adding and listing machines and their use in multi-variable work 

For a good general description of these machines we may refer to (6). Here we will only 
re-state some of their features. 

There are two types of machines, differing principally in the method of setting numbers. On 
full-keyboard machines there is a column of keys numbered 1 to 9 representing units, a second
column (to the left) representing tens, and so on. Capacities range up to 8 columns (i.e., 9999
9999) or 10 columns (i.e., 99999 99999) or more. Figures are set by depressing the appropriate
keys in the required columns. On ten-key machines there are only ten keys corresponding to
the digits 0, 1, 2 ... 9. The keys corresponding to the amount to be entered are struck in
succession, as on a typewriter. On either type pressing the motor bar after a setting causes the 
amount set to be printed on a paper tape and also added to (or subtracted from) the amount 
already stored in the adding mechanism or register of the machine. When all figures to be added 
have been thus entered, depressing a total key prints the total and clears or zeroizes the register. 
A similar sub-total key prints as before, but retains the total in the register. With a full-key- 
board machine (see the Victor in Fig. 1) the large capacity can often be utilized for adding paired 
observations simultaneously. A cipher split (which usually splits a 10-column keyboard in the
middle, but can be specified in any position) permits the printing of two separate series of observations
side by side (see table of length and thickness in Table II). The vertical spacing of the
printed lines is, of course, uniform. 

We now turn to the use of such machines in multi-variable work. The several variables are
added (in order of observation number) from the various note-books and forms. In this process 
the operator does not set the original observations, but their deviations (formed mentally), thus 
obtaining a printed record of the deviations and their totals. In Table I we give original length 
and thickness records, and in Table II a reproduction of the tape of the adding machine. This 
work may often be checked by comparing the printed totals with those of the original variables 
if given in the note-book. If they are not, they must be produced by the operator in a separate 
run. In the event of disagreement, the tapes are a great help in finding the error. 

When all the tapes have been checked they are assembled and pasted up in the form of 
Schedule (i). This paste-up may be folded in harmonica fashion, so that any two columns may 
be brought into juxtaposition for the subsequent formation of sums of products. Alternatively, 
each tape may be left separate, and need only be reinforced so that the tapes for any two variables 
may be pinned together. If N exceeds 30, the deviations should be split (by horizontal dividing 
lines) into sections, each with its own total. If k is large, the sections should not exceed 20 in 
length or even 10 (see S-check in (iii) (b)).

The above procedure is also convenient if the original data are recorded in some non-decimal 
system unsuitable for the subsequent formation of sums of squares and products (such as lbs. 
and ozs., stones and lbs., shillings and pence). In Table III is given a sample of height and weight 
records of schoolboys, recorded in feet and inches and stones and lbs.



                        Table III

                    Weight              Height
                  st.    lb.          ft.    ins.

  Scholar 1        3     12            4      1.1
  Scholar 2        4      1            4     11.4
  Scholar 3        4     11            4      5.7
  Scholar 4        3      8            3     11.8
  Scholar 5        5      6            4      8.1

                  19     38 *         19     38.1 *




Two small auxiliary tables would be prepared giving the conversion of feet and inches straight 
into inches with a working origin of integer feet omitted. A convenient working origin would 
be 3 feet, and a section of such a conversion table is given below : 


  Ft.    Ins.    Set

   4      9      21
   4     10      22
   4     11      23
   5      -      24
   5      1      25


This would be placed (or pinned) in front of the operator so that the original height entries are 
quickly converted into the deviation of inches from the working origin set on the machine and 
added. In forming the check totals on the original data sheets, inches and feet are, of course, 
added separately, and the conversion is for the feet total only. The work, this time, should be 
carried out in two separate adding runs for heights and weights, as simultaneous use of two con- 
version tables is confusing. The tapes showing deviations from the working means are shown 
in Table IV. 


                        Table IV

                  Weight              Height
                   lb.               1/10 ins.

  Scholar 1         12                  131
  Scholar 2         15                  234
  Scholar 3         25                  177
  Scholar 4          8                  118
  Scholar 5         34                  201

                    94 *                861 *


Thus the adding and listing machine performs three tasks in one operation: (1) recording and
checking deviations from working origins; (2) forming and checking totals of these deviations;
(3) collating of variables in a suitable working schedule. 
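The conversion of Table III into Table IV, together with the total check described above, can be sketched as follows. Some assumptions are made for this illustration: heights are held as feet plus tenths of an inch, the working origin for weight is taken as 3 stones (inferred from the Table IV entries, not stated in the text), and the function name is hypothetical.

```python
LB_PER_STONE = 14
TENTHS_PER_FOOT = 120          # 12 inches, in tenths of an inch

weights = [(3, 12), (4, 1), (4, 11), (3, 8), (5, 6)]        # (st., lb.)
heights = [(4, 11), (4, 114), (4, 57), (3, 118), (4, 81)]   # (ft., 1/10 in.)

def to_deviation(major, minor, per_major, origin_major):
    """Convert a two-unit reading to minor units above a working origin."""
    return (major - origin_major) * per_major + minor

weight_dev = [to_deviation(st, lb, LB_PER_STONE, 3) for st, lb in weights]
height_dev = [to_deviation(ft, t, TENTHS_PER_FOOT, 3) for ft, t in heights]

print(weight_dev, sum(weight_dev))   # [12, 15, 25, 8, 34] 94
print(height_dev, sum(height_dev))   # [131, 234, 177, 118, 201] 861

# Check totals: major and minor units are added separately on the data
# sheet, and only the major total is converted.
n = len(weights)
total_st = sum(st for st, _ in weights)    # 19
total_lb = sum(lb for _, lb in weights)    # 38
assert (total_st - 3 * n) * LB_PER_STONE + total_lb == sum(weight_dev)
total_ft = sum(ft for ft, _ in heights)    # 19
total_tenths = sum(t for _, t in heights)  # 381
assert (total_ft - 3 * n) * TENTHS_PER_FOOT + total_tenths == sum(height_dev)
```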

(b) Calculating machines and their use in multi-variable work 

We now turn to the actual formation of the sums of products $S_{ij}$. A great variety of calculating
machines are available for this work. For a brief description of these we may refer to (6),
and for further details to trade literature. We will, however, briefly describe here how sums 
of products may be formed on these machines. 

The process of multiplication is really one of continued addition. If we wish to multiply 
(say) 347 by 4, we could set 347 and add it four times, thus obtaining 1388. To cater for multi- 
plication by numbers of more than one digit the adding mechanism is mounted in a movable 
carriage so that it can be stepped or moved into the correct position in sympathy with the units, 
tens, etc., position of the multiplier. Thus to multiply 347 by 42 we first step the carriage to 
the tens position and add 347 four times, showing the partial product 13880 in the product register; 
we then step to the units position and add twice to obtain the final answer 347 x 42 = 14574. 
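The stepping-carriage procedure amounts to the familiar shift-and-add scheme. The following sketch imitates it for non-negative whole numbers; the function name and the returned list of partial products are conveniences introduced here, not features of any particular machine.

```python
def shift_and_add(multiplicand, multiplier):
    """Multiply by repeated addition, one multiplier digit at a time.

    The carriage starts at the highest digit position of the multiplier
    and is stepped towards the units; at each position the multiplicand is
    added as many times as that digit requires.  Returns the final product
    and the product register as it stood after each carriage position.
    """
    product_register = 0
    partials = []
    digits = [int(d) for d in str(multiplier)]
    for offset, digit in enumerate(digits):
        position = len(digits) - 1 - offset      # 1 = tens, 0 = units, ...
        for _ in range(digit):                   # one turn of the crank each
            product_register += multiplicand * 10 ** position
        partials.append(product_register)
    return product_register, partials

print(shift_and_add(347, 4))    # (1388, [1388])
print(shift_and_add(347, 42))   # (14574, [13880, 14574])
```

The number of crank turns is the digit sum of the multiplier (six for 42).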

On most machines the multiplicand (347 in the above example) is set on a full keyboard (e.g.,
Friden, Madas, Marchant, Mercedes and Monroe); on others it is set by moving levers into the
appropriate digital positions (e.g., Britannic and Brunsviga); on others, again, it is set by means
of a 10-key keyboard (e.g., Facit).

On hand-operated machines the continued adding is done by turning the handle of a crank, 
making revolutions equal in number to the digits of the multiplier (four, then two, in the example) 
and stepping the carriage with the other hand (Brunsviga and Facit). On electric machines the 
revolutions of the crank (and often the stepping) are actuated by an electric motor. This motor 
is controlled either by the operator through a motor bar (semi-automatic multiplication), or by
relays (fully automatic multiplication). The setting of these relays is achieved by depressing the 
digits of the multiplier on a full keyboard (Monroe, Madas and Mercedes) or by striking in 
succession the keys of a ten-key keyboard (Marchant and Muldivo).

After the multiplier has been conveyed to the machine it is shown in the multiplier register 
(M.R.), whilst the product is shown in the product register (P.R.). The multiplicand is still set 
on the keyboard (or its equivalent). Some models provide a capacity of 10 figures in multiplicand
and multiplier and 20 figures in the product; this is expressed as a capacity of 10 × 10 × 20.




Machines of capacity 8 x 8 x 16 are also popular. Special features of some machines are de- 
scribed in (6), while Figs. 2 and 3 illustrate two. 

A sum of products is simply formed by setting in succession the individual multiplicands and 
conveying the corresponding multipliers. The P.R. will then accumulate progressively the 
required sum of products. 

The M.R. can accumulate the sum of the multipliers. However, it is desirable (and with 
semi-automatic multiplication necessary) to make the M.R. show individual multipliers in order 
to verify that the correct multiplier has been conveyed. It must therefore be cleared after each 
multiplication. We can, however, produce the sum of the multipliers in the P.R. to the left of 
the sum of products. To this end, on full keyboard machines, we set 1 in the extreme left-hand
column of the keyboard. 

Below is shown an example of a sum of two products, 516 x 312 + 728 x 479, formed in this 
way; 

  Keyboard         M.R.       P.R.

  1 0000 516       312        312 0 160 992
  1 0000 728       479        791 0 509 704

The sum of products is 509 704 and the sum of the multipliers 791. The latter should be identical 
with the total of the variables formed in (a), and thereby affords a check on multipliers (apart
from compensating errors). Care must be taken that the sum of products does not spill over 
into the sum of the multipliers. 

If a similar check is required for the setting of the multiplicands we may (on certain machines) 
increase the multiplier also by 1 in the same* relative position on the left. The P.R. will then
show, suitably separated, the number {N) of products, the combined sum of multiplicands and 
multipliers (in the middle) and the sum of products (on the right). 

The previous example, redone by the present method, becomes : 

  Keyboard         M.R.            P.R.

  1 0000 516       1 0000 312      1 000 0828 0 160 992
  1 0000 728       1 0000 479      2 000 2035 0 509 704

It will be seen that with three-figure multiplicands and multipliers a machine of a capacity 
8 × 8 × 16 is just adequate for this type of calculation, although the larger capacity of 10 × 10 × 20
is preferable. 
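Both checking devices rest on placing an extra 1 far enough to the left that the accumulated fields cannot overlap. The sketch below reproduces the two worked examples. It assumes the 1 stands in the eighth column (so that "1 0000 516" is the number 10,000,516); the function name and the use of `divmod` to read off the fields are choices made for this illustration.

```python
FLAG = 10 ** 7    # a 1 in the eighth column: "1 0000 516" = 10_000_516

def accumulate_products(pairs, flag_multiplier=False):
    """Accumulate (FLAG + multiplicand) * multiplier over all pairs.

    With flag_multiplier=True the multiplier is also increased by FLAG, so
    the product register carries the count of products as well.
    """
    product_register = 0
    for multiplicand, multiplier in pairs:
        keyboard = FLAG + multiplicand
        conveyed = FLAG + multiplier if flag_multiplier else multiplier
        product_register += keyboard * conveyed
    return product_register

pairs = [(516, 312), (728, 479)]

# Check on multipliers only: P.R. reads 791 0 509 704.
pr = accumulate_products(pairs)
sum_multipliers, sum_products = divmod(pr, FLAG)
print(sum_multipliers, sum_products)      # 791 509704

# Check on both factors: P.R. reads 2 000 2035 0 509 704.
pr2 = accumulate_products(pairs, flag_multiplier=True)
count, rest = divmod(pr2, FLAG * FLAG)
combined_sum, sum_products2 = divmod(rest, FLAG)
print(count, combined_sum, sum_products2)  # 2 2035 509704
```

The fields can be read off cleanly only while the sum of products stays below `FLAG`; that is the condition the text describes as the sum of products not spilling over into the sum of the multipliers.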

Although the above procedure provides a check, it usually slows up the operation, while on 
10-key machines like the Facit it is prohibitive because all the interspersed ciphers have to be
struck. Modifications are therefore required to achieve full efficiency on each particular machine.
With the Facit, for instance (see, e.g., (7)), we might check the sum of multiplicands only or forgo
refinements and accumulate the sum of multipliers in the P.R. The first-described method of 
checking the sum of multipliers only is often preferable (e.g., fully automatic Marchant and semi- 
automatic machines). In all these cases other fool-proof checks must, of course, be applied; 
indeed, even if the full check on both multipliers and multiplicands is applied, the copying of the 
sum of products from the P.R. still remains unchecked. 

It is therefore essential to superimpose on the above current checks complete overall checks. 
In the case of multi-variable work the so-called S-check is normally used. With this well-known
device the arithmetical sum of all the variables in each line of the schedule is formed and treated 
as an additional variable. 

For convenience in checking we usually set out a triangular schedule as shown : 

  i \ j     1        2        3        4      ...      k        S

    1     S_{11}   S_{12}   S_{13}   S_{14}   ...    S_{1k}   S_{1S}
    2              S_{22}   S_{23}   S_{24}   ...    S_{2k}   S_{2S}
    3                       S_{33}   S_{34}   ...    S_{3k}   S_{3S}
   ...
    k                                                S_{kk}   S_{kS}
    S                                                         S_{SS}

                              Schedule (ii)


* This is, of course, not necessary but convenient.





Checks are now made in accordance with the identity 

$S_{i1} + S_{i2} + \dots + S_{ik} = S_{iS}$     (1)

For instance, for $i = 2$ the quantities underlined in full in Schedule (ii), namely $S_{12}, S_{22}, S_{23}, \dots, S_{2k}$, should add up to $S_{2S}$.

A great advantage of this check is that it helps to find any errors. Suppose, for instance, that
the calculated $S_{ij}$ satisfy the above identities for all $i$ except $i = 2$ and $i = 3$, where (1) is invalidated
by the same amount. It is then almost certain that $S_{23}$ (the point of intersection of
the full and broken lines in Schedule (ii)) is in error by this same amount; it should therefore be
re-done to confirm this hypothesis. If the identity is not satisfied for $i = 3$ only, the sum of
squares $S_{33}$ is the culprit. With a large number of discrepancies this process of error-finding
may degenerate into a time-consuming cross-word puzzle. This indicates that the S-check is being
abused; it is being asked to check too much work! It is therefore advisable, if N is large, to
record separate sums of products for each of the sections of about 20 observations for which
check totals were already formed in (a) and to apply the S-check to each section. The extra
work of copying such sectional sums of products will be found to pay dividends at the S-check
stage. Also if each section is first completely checked by its own S-check the adding of the
sectional $S_{ij}$ is in turn checked by the adding of the S values. Indeed, this is an additional safeguard
that all errors found by the sectional S-checks have in fact been corrected.
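The check identity and the way it localises a slip can be illustrated as below. This is a sketch under assumptions: the data are hypothetical, the full symmetric matrix is stored rather than the triangular schedule (so an error in one product is entered in both mirrored cells), and the function names are introduced here.

```python
def sums_of_products_with_s(columns):
    """Raw sums of products for k variables plus the extra variable S,
    where S is the sum of all the variables in each line of the schedule."""
    line_sums = [sum(line) for line in zip(*columns)]
    augmented = list(columns) + [line_sums]
    return [[sum(x * y for x, y in zip(a, b)) for b in augmented]
            for a in augmented]

def s_check(S):
    """Map each failing variable (0-based) to S_i1 + ... + S_ik - S_iS.
    An empty result means identity (1) holds for every i."""
    k = len(S) - 1
    found = {}
    for i in range(k):
        gap = sum(S[i][:k]) - S[i][k]
        if gap != 0:
            found[i] = gap
    return found

columns = [[1, 2, 3], [4, 0, 2], [5, 1, 1]]   # hypothetical: k = 3, N = 3
S = sums_of_products_with_s(columns)
print(s_check(S))      # {}

# A slip of 7 in the product of variables 2 and 3 upsets exactly the
# identities for i = 2 and i = 3, each by the same amount.
S[1][2] += 7
S[2][1] += 7
print(s_check(S))      # {1: 7, 2: 7}
```

A slip in a sum of squares, by contrast, would show up in a single identity only, as the text notes for $S_{33}$.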

In certain cases it will be advisable to split the variables into sections, each to be checked by 
its own S-check. This procedure is useful for very large k, or when not all possible intercorrelations
are wanted, but only certain rectangular or triangular arrays of the triangular Schedule (ii).
The splitting into sections should then, of course, be chosen to coincide with vertical boundaries
of these arrays. 

If the S-check is carried out in small sections it may be advisable to dispense with some or all
of the current checks on the sums of multipliers and multiplicands. More errors (if made) must
then be found by the S-check, but the formation of individual sums of products is speeded up.
For instance, by sacrificing the check on the sum of multiplicands, two sums of products can
often be formed simultaneously on machines with a 10-column keyboard by setting two multiplicands,
one on the left-hand three columns, and another on the right-hand three columns.

For positive multipliers and multiplicands the fully automatic Marchant is probably the 
fastest machine for this operation. 

A word should be said about the special case of a correlation between two variables x and y. 
The five required answers 

$\Sigma x$, $\Sigma y$, $\Sigma x^2$, $\Sigma xy$, $\Sigma y^2$

can be produced in one operation by setting x on the left, y on the right and squaring $(10^{p}x + y)$.
The P.R. then shows, in separate sections, $\Sigma x^2$, $2\Sigma xy$ and $\Sigma y^2$, while the M.R., if not cleared after
each multiplication, shows $\Sigma x$ on the left and $\Sigma y$ on the right. This operation is particularly
convenient on the Monroe AA-1 or the Madas, as on these machines the multiplicand and the 
multiplier are set and conveyed from the same keys, so that the quantity to be squared is com- 
municated once only to the machine. This method is described by Dwyer (9) but has been 
practised by users of the above machines in this country for a long time. 
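As an illustration of this squaring device, the sketch below applies it to the deviations of Table II. The field width of seven columns is an assumption chosen here so that a three-figure x and y fit a 10-column setting; the function name is hypothetical.

```python
def paired_squares(xs, ys, width=7):
    """Square 10**width * x + y for each pair and unpack the registers.

    Returns (sum x, sum y, sum x^2, sum xy, sum y^2).  The unpacking is
    valid only while sum y, sum y^2 and 2 * sum xy each stay below
    10**width, i.e. while no field spills into its neighbour.
    """
    base = 10 ** width
    product_register = 0
    multiplier_register = 0          # not cleared between multiplications
    for x, y in zip(xs, ys):
        setting = base * x + y       # x on the left, y on the right
        product_register += setting * setting
        multiplier_register += setting
    sum_x, sum_y = divmod(multiplier_register, base)
    sum_x2, rest = divmod(product_register, base * base)
    twice_sum_xy, sum_y2 = divmod(rest, base)
    return sum_x, sum_y, sum_x2, twice_sum_xy // 2, sum_y2

length_dev = [178, 373, 31, 343, 477]      # Table II, length
thickness_dev = [77, 99, 334, 433, 109]    # Table II, thickness
print(paired_squares(length_dev, thickness_dev))
# (1402, 1052, 516952, 261499, 326656)
```

The first two results reproduce the totals printed on the adding-machine tape, which is the check on the settings mentioned in the text.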

A neat application of this method is the sum of squares of quantities in a non-decimal system 
without conversion. Thus for squaring the heights given in Table III directly we would proceed
on the Monroe or Madas as follows : 


  Set on keyboard and square          P.R.

          4 000 011             16 000 088 000 121
          4 000 114             32 001 000 013 117
          4 000 057             48 001 456 016 366
          3 000 118             57 002 164 030 290
          4 000 081             73 002 812 036 851


At the end of the run we would have in the M.R. 19 000 381, which should agree with the totals 
in Table III previously obtained, thereby checking the setting. The answer for the sums of 
squares in (^ ins.)* is then obtained from the quantities in the P.R. thus : 

144 00 X 73 

4- 12 0 X 2 812 

+ 1 X 36 851 


1 425 491 
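The run above can be retraced in a few lines of Python. This is only an illustrative sketch: the settings are read off the keyboard column above, and the interpretation of the right-hand section as tenths of an inch (so that one foot is 120 units) is an assumption inferred from the multipliers 14 400 and 120.

```python
# Heights as (feet, tenths of an inch), read from the keyboard settings above.
# The unit of 1/10 inch is an assumption inferred from the factors 14400 and 120.
heights = [(4, 11), (4, 114), (4, 57), (3, 118), (4, 81)]

SHIFT = 10**6  # feet are set six decimal places to the left of the inches

mr = 0  # multiplier register: running sum of the settings
pr = 0  # product register: running sum of their squares
for feet, tenths in heights:
    setting = feet * SHIFT + tenths
    mr += setting
    pr += setting * setting

# Unpack the three sections of the product register.
sum_ff, rest = divmod(pr, SHIFT**2)        # sum of feet^2
two_sum_ft, sum_tt = divmod(rest, SHIFT)   # 2 * sum(feet * tenths), sum of tenths^2

# One foot = 120 tenths of an inch, so feet^2 carries the factor 120^2 = 14400.
total = 14400 * sum_ff + 120 * two_sum_ft + sum_tt
print(mr, (sum_ff, two_sum_ft, sum_tt), total)
```

The sections only stay separate while each partial sum fits within its own six places, which is why the spacing of the two settings on the keyboard matters.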




[Fig. 3. See p. 159.]

[Fig. 4. Punched card. See p. 161.]

[Fig. 5. See p. 161.]

[Fig. 6. See p. 161.]







The number of products to be formed in intercorrelation work is $\tfrac{1}{2}Nk(k+1)$, and with the
S-check included this figure goes up to $\tfrac{1}{2}N(k+1)(k+2)$. The work is therefore proportional
to the sample size (N), and increases as the square of the number of variables (k). For 20 variables
with 300 observations the number of products is already in the neighbourhood of 60,000, while
for 50 variables with 1,000 observations it is well over a million. The latter type of requirement
is unusual, but the former scale of work is quite common. When work of this magnitude is
contemplated it becomes necessary to use machines more powerful than ordinary desk calculators;
such help is admirably rendered by the punched-card system.
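The two figures quoted can be checked directly from the formula. The following is a minimal sketch; the function name and the `s_check` flag are illustrative only.

```python
def number_of_products(n_obs: int, k_vars: int, s_check: bool = False) -> int:
    """Products needed for all sums of squares and products of k variables.

    With the S-check the sum S is treated as one more variable.
    """
    k = k_vars + 1 if s_check else k_vars
    return n_obs * k * (k + 1) // 2

print(number_of_products(300, 20))         # the "neighbourhood of 60,000" case
print(number_of_products(1000, 50))        # the "well over a million" case
print(number_of_products(300, 20, True))   # the same 20 variables with the S-check
```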


(iv) The use of Hollerith punched-card equipment for multi-variable work

Before describing the method on Hollerith equipment we should recall that there is another 
punched-card system — the Powers-Samas. Whilst for commercial and routine statistical work 
(for which these systems were really designed) both have their advantages and disadvantages, the 
flexibility offered by the Hollerith plugboard, as opposed to the Powers-Samas connection box, 
is a great advantage for scientific work, where the nature and details of the calculations alter 
frequently. For a detailed description of both systems we may refer to the excellent account 
given in (3). 

(a) A description of a Hollerith equipment and some of its functions 

Since the appropriate Hollerith equipment is well described in (3) and a brief account is also
given in a paper submitted to this Society (5), we may confine ourselves here to a brief outline of
some of its functions (but not of the technical detail).

Cards. Data can be recorded on cards by punching holes in them. These holes are sensed
electrically by the various machines through which the cards are passed. The card (see Fig. 4)
is 7 3/8 by 3 1/4 inches and contains 80 vertical columns, each capable of recording one of the digits
0, 1, 2, ..., 9; there are also two extra positions at the head of each column, called X and Y,
which have special uses (see, e.g., (b)). A group of several columns is called a field. Thus in
a four-column field we may record any four-figure number such as 3192 by punching in its four
columns from left to right 3, 1, 9 and 2 respectively (see field A in Fig. 4). Cards for special
purposes can be printed with the subdivision of the columns into fields clearly indicated.

Mechanical Key Punch. Cards are punched in the mechanical key punch column by column. 
A somewhat similar machine is used to verify the punching. The work is done by junior operators, 
who with some experience will average 150 cards per hour. This is a very good speed if all 
80 columns are punched, but is of course exceeded if fewer columns are punched. 

The Sorter (see Fig. 5). The function of the sorter is to separate the cards into groups
according to the holes punched in any single selected column. The cards are deposited in one or other
of 12 boxes, according to whether Y, X, 0, 1, ..., 9 is punched in that column; cards with no
holes punched in the sort-column fall into a 13th box. The machine may obviously be used to
sort cards into numerical order according to the amounts punched in any field. To do this,
sorting is done first on the units, then on the tens, and so on. The theoretical speed of this machine
is 24,000 cards an hour.
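Sorting on the units first and then on each higher column in turn works because each pass keeps the cards of one box in the order in which they arrived. The sketch below imitates this with ten boxes only; the X, Y and blank boxes are left out, and cards are represented as strings of digits, both of which are simplifications made here for illustration.

```python
def sort_on_column(cards, column):
    """One pass through the sorter: drop each card into the box for its digit."""
    boxes = [[] for _ in range(10)]
    for card in cards:
        boxes[int(card[column])].append(card)
    # Collect the boxes in order 0, 1, ..., 9, keeping the order within each box.
    return [card for box in boxes for card in box]

def sort_on_field(cards, first_col, last_col):
    """Sort on the units column first, then the tens, and so on."""
    for column in range(last_col, first_col - 1, -1):
        cards = sort_on_column(cards, column)
    return cards

cards = ["3192", "0457", "3102", "0007", "1290"]
print(sort_on_field(cards, 0, 3))
```

A four-column field therefore needs four passes, whatever the number of cards.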

Rolling Total Tabulator (Fig. 6). This machine senses electrically the holes punched in each
column of the card by means of two sets of 80 brushes. It contains six adding registers or counters of
11-figure capacity each; numbers can be added from any column of the card into any wheel of
any counter or into several wheels simultaneously. These numbers can, if desired, be printed in
any position of a print bank of 80 type-bars. Instructions to the machine are conveyed through
a plugboard.

The normal operation is to add the numbers punched in certain fields (such as A to E in
Fig. 4) of a group of cards into certain counters to form the field totals. The tabulator can be
plugged to sense the end of such a group of cards automatically (technically known as breaking
of control). The machine then normally prints the totals in all counters and clears the counters
before beginning to add the fields from the next group of cards. Alternatively, it can be arranged
that at the end of each group certain counters add (or subtract) their totals into any combination
of the other counters in a prearranged sequence of operations. As many as eight such operations,
or cycles as they are styled, can be performed at the end of each group.

Simpler tabulators, lacking these transfer facilities, are available at lower rentals. 




The fastest models take 0.4 second per card passage, but if the contents of each card is to be
printed the time is 0.6 second per card.

The Reproducing Punch. We shall describe this machine in a little more detail and include
some non-standard uses.

This machine reproduces the punching in any selected columns of one series of cards in any 
selected columns of another series of cards — an operation known as reproducing. 

The other fundamental operation is to copy-punch information from one particular card (master- 
card) on to all cards in a group (detail cards) ; the master-card is placed in front of the detail 
cards, and the group passed through the machine in what is called a plain gang-punch operation. 
This can be done with several master-cards, each followed by its associated pack of detail cards; 
all the cards are passed through without interruption at the end of the groups. Actually in this 
operation there are at any time two consecutive cards in the machine, which is sensing the holes 
in the first card ; the currents that pass through these holes are plugged to operate knives that 
punch the same digits in the corresponding positions of the second or following card. At the 
next cycle the first card is passed out to a stacker, while the second card (having just received 
its holes from the first) is passed under the sensing brushes and read in order to punch the same 
information on the third card. When the next master-card passes under the knives a special 
impulse automatically suppresses the punching for one cycle ; to this end master-cards, but not 
the detail cards, must have an X punched in a certain column, or vice versa. 

An important (non-standard) variation of gang-punching is offset gang-punching, a technique
that we shall apply later on to serial correlations. Imagine that we plug from the sensing brushes
of column 1 to the magnet operating the knives of column 2, from the sensing of column 2 to the
punching of column 3, and so on. This plugging may be schematically represented as shown
in Schedule (iii).

    Sensing columns      1    2    3    4   . . .   79   80
                          \    \    \    \            \
    Punching columns     1    2    3    4   . . .   79   80

Schedule (iii)

Imagine also that a hole (say a Y) is punched in column 1 of the first card and that all the
remaining cards have no holes punched. The effect of the consecutive sensing and punching will
then be that the second card will be punched Y in the second column, the third card in the third
column, the eightieth card in the eightieth column, and all subsequent cards (if any) will have
no holes punched.

Suppose now that all the columns in the first card are punched, and denote by (i) the digit
punched in the ith column of the first card. Then the effect of offset gang-punching will be as
shown in Schedule (iv).


    Column:      1     2     3     4     5   . . .   79    80
    Card No.
       1        (1)   (2)   (3)   (4)   (5)  . . .  (79)  (80)
       2              (1)   (2)   (3)   (4)  . . .  (78)  (79)
       3                    (1)   (2)   (3)  . . .  (77)  (78)
       .
       .
      79                                             (1)   (2)
      80                                                   (1)

Schedule (iv)

If the (i) are single-digit values of a time series, we see that after the cards have been offset
gang-punched columns 79 and 80 contain pairs of consecutive observations, viz. (79) and (80), (78)
and (79), ..., (1) and (2). Columns 78 and 80 contain pairs of observations two time intervals
apart, columns 77 and 80 contain observations three time intervals apart, and so on.
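The effect of the offset plugging is easily imitated. In the sketch below a card is a list of 80 entries, a blank column is represented by `None`, and the observations (1) to (80) are stood in for by the integers 1 to 80; these representations are assumptions made purely for illustration.

```python
def offset_gang_punch(first_card, n_cards):
    """Each card receives in column j + 1 what the preceding card holds in column j."""
    width = len(first_card)
    cards = [list(first_card)]
    for _ in range(n_cards - 1):
        previous = cards[-1]
        cards.append([None] + previous[:width - 1])  # column 1 receives nothing
    return cards

series = list(range(1, 81))              # stand-ins for observations (1) ... (80)
cards = offset_gang_punch(series, 80)

# Columns 79 and 80 (indices 78 and 79) hold consecutive observations.
lag1 = [(c[78], c[79]) for c in cards if c[78] is not None]
# Columns 77 and 80 hold observations three time intervals apart.
lag3 = [(c[76], c[79]) for c in cards if c[76] is not None]
print(lag1[:3], lag3[:3])
```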

A useful modification is to split the offset gang-punching into a series of independent fields.
Suppose the (i) represent eight separate 10-decimal quantities punched in columns 1-10, 11-20, . . .
71-80 respectively, and that we omit nine plugs as shown in Schedule (v).

    Sensing columns      1   2   3  . . .  9   10    11   12   13  . . .  19   20   . . .   79   80
                          \   \   \         \          \    \    \          \                \
    Punching columns     1   2   3  . . .  9   10    11   12   13  . . .  19   20   . . .   79   80

Schedule (v)





The resultant punching is shown in Schedule (vi). 


    Column:     1   2   3   4  . . .  9   10    11   12   13   14  . . .  19   20   . . .   79   80
    Card No.
       1        1   2   3   4  . . .  9   10    11   12   13   14  . . .  19   20   . . .   79   80
       2            1   2   3  . . .  8    9         11   12   13  . . .  18   19   . . .   78   79
       3                1   2  . . .  7    8              11   12  . . .  17   18   . . .   77   78
       .
       .
       9                              1    2                              11   12   . . .   71   72
      10                                   1                                   11   . . .        71

Schedule (vi)


We have thus copy-punched fields and at the same time divided the original quantities
progressively by powers of 10, but without any rounding off. This technique is of great help when
making what may be called ready-reckoner cards and will also be applied later. A mathematician
would undoubtedly call both the above applications trivial, but nevertheless they are useful and
efficient. The reason for this is that the machine works on practically all columns of the cards
at the same time.
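That the split plugging divides each quantity by successive powers of 10 without rounding can be seen from a small imitation. The sketch assumes a shortened card of two 10-column fields, digits held as characters and blanks as `None`; none of these details comes from the paper.

```python
def split_offset_gang_punch(first_card, n_cards, field_width=10):
    """Offset gang-punching with the plug into the first column of each field omitted."""
    cards = [list(first_card)]
    for _ in range(n_cards - 1):
        previous = cards[-1]
        new = [None] * len(previous)
        for col in range(1, len(previous)):
            if col % field_width != 0:      # no plug across a field boundary
                new[col] = previous[col - 1]
        cards.append(new)
    return cards

def field_value(card, field, field_width=10):
    """Read one field as a number, treating unpunched columns as absent."""
    digits = card[field * field_width:(field + 1) * field_width]
    return int("".join(d for d in digits if d is not None) or "0")

cards = split_offset_gang_punch(list("12345678900987654321"), 10)
for j, card in enumerate(cards[:4]):
    print(j + 1, field_value(card, 0), field_value(card, 1))
```

Card j + 1 carries each original quantity divided by 10 to the power j, with the remainder simply dropped off the right-hand edge of the field.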

Still another use of the reproducer punch is that it can be connected to a tabulator, and at
every break of control punch on to new cards the totals accumulated in any or all counters. When
used in this way it is known as a summary punch, and the cards it punches are described
as summary cards. The reproducer handles each card in 0.6 second, although when summary
punching it adds 2.2 seconds per summary card to the normal time of the tabulator.

The Multiplying Punch. This member of the Hollerith family will sense up to four fields (say
A, B, C, D) not exceeding eight digits on a card, form certain combinations like AB + C ± D, and
punch the result on unused columns of the same card. Although it can form products at from 700 
to 1,200 an hour, according to the number of digits in the multiplier, its somewhat high rental has 
not encouraged its use in scientific and statistical work, except in certain particularly favourable 
circumstances. For this reason no applications are cited in this paper. 

The Collator. This youngest child of the family feeds two separate stacks of cards through two
feed units, and has four boxes or receptacles in which cards may be deposited. Instructions are,
as usual, conveyed by a plugboard. Its simplest application is to selecting cards having a certain
number, or those above or below that number, from a stack of cards. The discriminating number
would be on a single card that would remain at its sensing station throughout, and would direct
the cards of the main stack into which box they should go. Similarly, two packs of cards, each in
order, could be fed through the two feeding units, and would emerge in order as a single pack.
As a variant, cards in pack A that found no mates in pack B could be directed to one box, and
conversely, and all mated cards to a third box. In short, all kinds of merging and separating tricks
can be done with this machine. Its comparative newness and lack of accessibility have not
enabled it to be used much so far outside large commercial installations, but it has interesting
scientific possibilities.
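The merging and mating operations just described correspond to two familiar procedures on ordered lists. The following sketch represents each card simply by its control number; the function names and the use of plain integers are illustrative assumptions, not a description of the collator's plugging.

```python
def merge(pack_a, pack_b):
    """Combine two packs, each already in order, into a single ordered pack."""
    merged, i, j = [], 0, 0
    while i < len(pack_a) and j < len(pack_b):
        if pack_a[i] <= pack_b[j]:
            merged.append(pack_a[i])
            i += 1
        else:
            merged.append(pack_b[j])
            j += 1
    return merged + pack_a[i:] + pack_b[j:]

def match(pack_a, pack_b):
    """Separate unmated cards of A, unmated cards of B, and all mated cards."""
    in_a, in_b = set(pack_a), set(pack_b)
    only_a = [card for card in pack_a if card not in in_b]
    only_b = [card for card in pack_b if card not in in_a]
    mated = [card for card in merge(pack_a, pack_b) if card in in_a and card in in_b]
    return only_a, only_b, mated

print(merge([1, 4, 7], [2, 4, 9]))
print(match([1, 4, 7], [2, 4, 9]))
```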


(b) The calculation of sums of products on Hollerith equipment 

In spite of the existence of the multiplying punch, it is usually more economical to use the
tabulator to form sums of products for N > 30.

Generally speaking, the most efficient method is that known here as summary multiplication,
or in America as progressive digiting. It has been widely used, both here and in the U.S.A. Good
descriptions are given in (5) and (2). Nevertheless, for completeness' sake, let us recall this method
here briefly.

Suppose we have cards with a single-digit variable a in column 1 and a series of corresponding
variables b (not restricted in size) in other columns of the same cards. The problem is to form
$\sum ab$. If we sort these cards on column 1 into 10 packets corresponding to a = 9, 8, ..., 1, 0,
we may then add the field b in a tabulator and print totals B of groups as follows:

    $B_9 = \sum b$  for a = 9
    $B_8 = \sum b$  for a = 8
    . . .
    $B_1 = \sum b$  for a = 1
    $B_0 = \sum b$  for a = 0

Obviously

    $\sum ab = 9B_9 + 8B_8 + \dots + B_1$
    $= B_9 + (B_9 + B_8) + \dots + (B_9 + B_8 + \dots + B_1)$

The second form shows that if we feed the packs in descending order of a and omit the pack
in which a = 0, and do not clear the counter after each pack, we shall print the progressive
totals $B_9$, $B_9 + B_8$, ..., which need only to be added to give $\sum ab$. If we summary punch the
progressive totals, a run of the summary cards through the tabulator will form the desired sum of
products.
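The identity behind summary multiplication can be verified on a handful of cards. In the sketch below each card is a pair (a, b) with a a single digit; the counter is never cleared between packs, and a progressive total is recorded after every digit from 9 down to 1, whether or not the pack is empty. The data are invented for illustration.

```python
def progressive_digiting(cards):
    """Form the sum of a*b using additions only; a must be a single digit 0..9."""
    counter = 0              # accumulates b; not cleared between packs
    summary_cards = []       # one progressive total per digit 9, 8, ..., 1
    for digit in range(9, 0, -1):        # packs in descending order, pack 0 omitted
        counter += sum(b for a, b in cards if a == digit)
        summary_cards.append(counter)    # punched at the break of control
    return sum(summary_cards)            # the run of the summary cards

cards = [(3, 12), (9, 5), (0, 7), (3, 4), (7, 10)]
print(progressive_digiting(cards), sum(a * b for a, b in cards))
```

A card with multiplier a is counted in exactly a of the progressive totals, which is why their sum reproduces the sum of products.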

Now, this looks like using a steam-hammer to crack a nut! It would indeed be wasteful
were it not for the fact that the large capacity of the tabulator permits further variables, c, d, e,
etc., punched in other fields of the same pack, to be added at the same time as the b's. Their
progressive totals are included in the same summary cards, and in the same subsequent tabulation
run, without adding a second to the machine time. The golden rule for getting a high efficiency
factor in summary multiplication is to make the fullest possible use of the counter capacity of
the tabulator.

Let us now discuss briefly further details of the method : — 

(A) If the variable a is (say) a two-digit number, we repeat the above process, sorting on the
tens column of a and adding the variables b, c, d, . . . as before. We may summary punch these
progressive totals one column to the left, so that they may be added with those of the first run to
form the complete sum of products.
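Punching the tens-run totals one column to the left amounts to multiplying them by 10 before they are added to those of the units run. A minimal sketch, with invented data and with the digit-selecting functions passed in as an illustrative device:

```python
def digit_run(cards, digit_of):
    """One sort-and-tabulate run: sum of the progressive totals for one digit of a."""
    counter, total = 0, 0
    for digit in range(9, 0, -1):
        counter += sum(b for a, b in cards if digit_of(a) == digit)
        total += counter
    return total

cards = [(47, 12), (90, 5), (8, 7), (33, 4)]      # (a, b) with a of two digits

units_run = digit_run(cards, lambda a: a % 10)    # sorted on the units column of a
tens_run = digit_run(cards, lambda a: a // 10)    # sorted on the tens column of a

# Totals of the tens run are punched one column to the left, i.e. multiplied by 10.
result = units_run + 10 * tens_run
print(units_run, tens_run, result)
```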

(B) Sums of squares are, of course, merely the special case when the variable on which the 
cards are sorted is also among the variables to be added. 

(C) For moderate sample sizes (N ≈ 30) some of the digits 1, 2, ..., 9 may not occur in a.
Thus if there were no card with a = 7, the sum of the progressive totals

    $B_9$,  $B_9 + B_8$,  $B_9 + B_8 + B_6$,  . . .,  $B_9 + B_8 + B_6 + \dots + B_1$

would not be equal to $\sum ab$, but would be in defect by $B_9 + B_8$.

The required correction becomes more complicated when there are several gaps, or even runs
of gaps. Although there are various ways of allowing for these gaps, it is better to avoid them by
punching several red space cards with Y in odd columns and X in even columns, and several blue
cards with X in odd columns and Y in even columns. After sorting the cards on a digit of a, put
space cards in all empty boxes, making sure that space cards in adjacent boxes are of different
colours. These dummy multipliers (X or Y) will break control, and thus cause the desired missing
progressive totals to be punched. On the other hand, the holes X and Y in the multiplicand
fields do not affect the addition.

The former method of using blank space cards (5), (2) is no longer used, as with the type of 
control fitted to-day two consecutive blank cards would break control once only. It is obvious 
that no gaps above the highest digit occurring need be filled. During the next sort all space cards 
will naturally fall into the X and Y boxes, whence they will be returned to their red and blue 
packs. 
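The defect caused by a missing digit, and its removal by a space card, can be illustrated as follows. The flag `fill_gaps` stands for the insertion of space cards in empty boxes; the data, with no card for a = 7, are invented to match the case discussed under (C).

```python
def summed_progressive_totals(cards, fill_gaps):
    """Sum of the progressive totals actually punched in one run."""
    present = {a for a, _ in cards}
    counter, total = 0, 0
    # No gaps above the highest digit occurring need be filled.
    for digit in range(max(present), 0, -1):
        if digit not in present and not fill_gaps:
            continue         # empty box: no break of control, no total punched
        counter += sum(b for a, b in cards if a == digit)
        total += counter     # total punched at the break of control
    return total

# Every digit from 9 to 1 occurs except 7; here B9 = 10 and B8 = 9.
cards = [(9, 10), (8, 9), (6, 7), (5, 6), (4, 5), (3, 4), (2, 3), (1, 2)]
true_sum = sum(a * b for a, b in cards)

print(true_sum,
      summed_progressive_totals(cards, fill_gaps=False),
      summed_progressive_totals(cards, fill_gaps=True))
```

Without the space card the result falls short by exactly the progressive total that was never punched.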

(D) A single 11-wheel counter will hold the progressive totals of two 3-figure variables,
provided that the sum of the variables does not exceed 10®. With 2-figure variables it will 
rarely be possible to add three variables in a counter, particularly as the sum of the variables 
will usually have one or two digits more than the variables themselves. In statistical work, there- 
fore, any variation from two variables to a counter will be rare. Hence with a six-counter tabulator 
maximum efficiency is secured if we have 11 variables and an S.

On a tabulator specially designed by G. B. Hey, and now at the National Physical Laboratory, 
the left-hand wheel of any counter can be joined to the right-hand wheel of any other ; this opens 
up possibilities for squeezing still more on to the machine. 




With (say) 11 variables and their S, which we treat as a variable, we must sort on each variable
in turn in order to obtain all sums of squares. We cannot, therefore, do better than add all the 12
variables every time. We thus form $\sum ab$ by adding the b's after sorting on a, and $\sum ba$ by adding the
a's after sorting on b, and so on. The addition of the summary cards would then form and print
the 12 lines of Schedule (ii), completed symmetrically as a full-square pattern; the 66 sums of
products would each be produced twice and the 12 sums of squares once. This duplication
takes no longer, and has the advantage of providing a good check on the correct working of most
of the machine operations. If there are more than 11 variables, it is of course often possible to
avoid producing every sum of products twice over; ingenious schemes have been worked out
by G. B. Hey in which a good compromise is reached between convenience of plugging the fields
to be added and the arrangement in which the answers are printed.

(E) What happens if the number of variables is less than 11? Are we doomed to inefficiency?
In the first place, Hollerith tabulators having four counters only are available at a lower rental.
Sometimes, however, it may be useful not to punch summary cards, but to use some of the six
counters to accumulate the progressive totals as they are being formed. Thus four progressive
totals formed in counters 1 and 2 could be added in counters 3 to 6.

The ability to transfer totals from one counter to another (i.e., rolling in Hollerith terminology)
is required for this method. Although the rentals of rollers are somewhat higher than those of
simpler machines, the above method is very useful and efficient with a few variables, or when
no summary punch is available. In certain circumstances it is worth while using this method for
k > 11.

The important special cases of one or two variables (plain sum of squares, analysis of variance 
and covariance) are not usually efficient by this method, since it is, as a rule, not worth while 
punching cards for the mere purpose of producing one sum of squares or two sums of squares and 
one sum of products. On the other hand, if the data are already on cards for other purposes (e.g.,
for multiple classification by qualitative variables), methods like the above may prove economical. 
However, other methods of summary multiplication demand critical comparison with progressive 
digiting and, because of its importance, the special case of sums of squares is dealt with by another 
method in the next section. 

(v) The master-card method of producing sums of squares 

Suppose that N observations of a single 2-figure variable are already punched (say in 
columns 1 and 2), that four columns are still available (say columns 3 to 6) and that N is of the 
order 2,000. Suppose also that it is desired to carry out a number of analyses of variances with 
these data so that sums of squares are wanted for various sub-groups, each of which would, of 
course, have a designating code punched in certain columns for sorting purposes. 

We now punch 99 master-cards for the range a = 1, ..., 99, punching a in columns 1 and 2 and
values of $a^2$ in columns 3 to 6. To distinguish these master-cards from ordinary cards, an X must
also be punched in some convenient column. The master-cards, followed by the N data cards, are
now sorted into serial order of a. We then gang-punch the squares from the masters on to the
data cards. Each card will then contain an observation and its square. The cards are then
sorted into the experimental sub-groups and group totals of a and $a^2$ formed by the tabulator.
This process does not have a high efficiency factor, as only two fields are being added, but if several
variables a, b, c, . . . are all punched on the same card, and if an analysis of variance is required
for each, the fact that only one tabular run is required for all the adding leads to an effective process.

Generalizations of this procedure are obvious. The essential condition is that the range of
the observations in units of their last figure should not exceed about N/10. If A is a convenient
working mean near the middle of the range, the master-cards could contain a and $(a - A)^2$.
Hence it is not necessary that the observations should be 2-figure numbers. If the range of
observations is greater than N/20, the observations may be grouped and Sheppard's correction
applied.
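In modern terms the master-cards act as a table of squares that is copied on to every data card with the same a, after which only additions are needed. The sketch below uses invented data and represents a card as a (sub-group, a) pair, which is an assumption made for brevity.

```python
from collections import defaultdict

# Master-cards: a in columns 1-2 and a^2 in columns 3-6, for a = 1, ..., 99.
MASTERS = {a: a * a for a in range(1, 100)}

def gang_punch_squares(data_cards):
    """Give every data card the square carried by the master-card of the same a."""
    punched = []
    for group, a in sorted(data_cards, key=lambda card: card[1]):  # serial order of a
        punched.append((group, a, MASTERS.get(a, 0)))              # a = 0 has square 0
    return punched

def group_totals(punched_cards):
    """Tabulate n, the sum of a and the sum of a^2 for each experimental sub-group."""
    totals = defaultdict(lambda: [0, 0, 0])
    for group, a, a_squared in punched_cards:
        totals[group][0] += 1
        totals[group][1] += a
        totals[group][2] += a_squared
    return dict(totals)

data = [("A", 12), ("B", 47), ("A", 30), ("B", 5), ("A", 12)]
print(group_totals(gang_punch_squares(data)))
```

No multiplication is performed on the data cards themselves; each square is computed once, on its master-card.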

Part II. Miscellaneous applications of calculating machines 

In this section we give a number of selected applications of calculating machines to statistical 
problems. The first application (i) is straightforward and, although useful, may not be new. 
The other applications, (ii), (iii) and (iv), are, to the best of my knowledge, new. 




(i) The use of adding and listing machines for the calculation of moving averages 

In Table V is given a time series of 24 values $a_r$ for r = 1, 2, . . . 24; it is required to calculate
the 19 moving averages of 6, i.e.,

    $m_i = \tfrac{1}{6}\sum_{r=i}^{i+5} a_r$,    i = 1, 2, . . . 19.

For most purposes it will be sufficient to calculate moving totals, $M_i = 6m_i$, as the division by 6
is usually unnecessary, particularly when answers are plotted.

The method of calculation is to keep a moving total of six values constantly in the adding 
machine, by adding and subtracting a value of the series at each stage. This is illustrated in the 
tape reproduction in Table VI, to which the following paragraph refers. 


Table V

Time Series

     r    a_r       r    a_r       r    a_r       r    a_r
     1   .361       7   .312      13   .216      19   .287
     2   .359       8   .278      14   .151      20   .223
     3   .351       9   .210      15   .139      21   .235
     4   .360      10   .208      16   .162      22   .312
     5   .349      11   .223      17   .210      23   .378
     6   .356      12   .198      18   .236      24   .415

                                        Total   6.529 T


Table VI

Calculation of moving totals for time series in Table V

     i    Subtract     Add      Sub-total
     1    (first six values added)   2.136 S
     2     .361 -     .312      2.087 S
     3     .359 -     .278      2.006 S
     4     .351 -     .210      1.865 S
     5     .360 -     .208      1.713 S
     6     .349 -     .223      1.587 S
     7     .356 -     .198      1.429 S
     8     .312 -     .216      1.333 S
     9     .278 -     .151      1.206 S
    10     .210 -     .139      1.135 S
    11     .208 -     .162      1.089 S
    12     .223 -     .210      1.076 S
    13     .198 -     .236      1.114 S
    14     .216 -     .287      1.185 S
    15     .151 -     .223      1.257 S
    16     .139 -     .235      1.353 S
    17     .162 -     .312      1.503 S
    18     .210 -     .378      1.671 S
    19     .236 -     .415      1.850 S
    20     .287 -     .361      1.924 S
    21     .223 -     .359      2.060 S
    22     .235 -     .351      2.176 S
    23     .312 -     .360      2.224 S
    24     .378 -     .349      2.195 S
    25     .415 -     .356      2.136 S
                                2.136 T


Set, print and add the first six values of a. Sub-total to print $M_1$ = 2.136. Set $a_1$ and subtract,
set $a_7$ and add, sub-total to print $M_2$ = 2.087, and so on until $M_{19}$ = 1.850 is formed. We now
continue cyclically by introducing six dummy values $a_{25} = a_1$ = .361, . . ., $a_{29} = a_5$ = .349,
$a_{30} = a_6$ = .356. This gives $M_{20}$ = 1.924, and so on until the last moving total $M_{25}$ = 2.136 is
produced, when the machine is totalled. To facilitate the picking up of the values $a_1$, $a_7$; $a_2$, $a_8$
(i.e., pairs of values six time intervals apart) it usually pays to cut a little stencil with two apertures
exactly six lines apart.

The work is completely* checked by the identity $M_{25} = M_1$. The last six dummy totals are
struck out on the tape, leaving a checked printed list of the remaining 19. Any errors of setting
are quickly found by reference to the tapes. If conversion into moving averages is desired, it is
advisable to retain the six dummies in order to apply an S-check after dividing by 6.

* Except, of course, for exactly compensating errors.

If moving averages are required for two parallel series, they can be dealt with simultaneously
by adding and listing them side by side. (See Part I (iii).)
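The procedure, including the cyclic check, may be sketched as follows. The values of Table V are entered in units of 0.001 so that the arithmetic is exact; the function name is illustrative.

```python
# Table V in units of 0.001.
series = [361, 359, 351, 360, 349, 356, 312, 278, 210, 208, 223, 198,
          216, 151, 139, 162, 210, 236, 287, 223, 235, 312, 378, 415]

def moving_totals(values, span=6):
    """Moving totals kept by one subtraction and one addition at each stage.

    The series is continued cyclically with `span` dummy values, so that the
    last total must reproduce the first.
    """
    extended = values + values[:span]      # dummies a25 = a1, ..., a30 = a6
    total = sum(extended[:span])           # first six values: M1
    totals = [total]
    for i in range(len(values)):
        total += extended[i + span] - extended[i]
        totals.append(total)
    return totals                          # M1, ..., M25

totals = moving_totals(series)
genuine = totals[:19]                      # the dummy totals are struck out
print(genuine, totals[-1] == totals[0])
```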

(ii) The calculation of serial correlations on Hollerith 

Let us assume that there is a time series of 108 observations,* (1), (2), (3), . . . (108), all
consisting of positive 2-figure* numbers. It is required to form all serial correlation coefficients up
to order 35,* i.e., we are to correlate observations that are 1, 2, 3, . . . 35 time intervals apart.

The time series is now split into three groups of 36 consecutive observations and each group is
punched on to 36 consecutive 2-figure fields of a card, called Fields I, II, . . . XXXVI. The
remaining 8 columns are used for indicative matter. Each of the 3 cards thus punched is now
placed as a master [see Part I (iv)] in front of a pack of 35 blank detail cards, and the total pack
of 108 cards thus formed is offset gang-punched in the manner shown by the ordinary figures in
Schedule (vii). Copy-punching is in the direction of the downward arrows.


                                Field:      I       II      III    . . .   XXXV    XXXVI

    1st Master                             (1)     (2)     (3)    . . .   (35)    (36)
    First pack of detail cards    1        (2)     (3)     (4)    . . .   (36)    [37]
                                  2        (3)     (4)     (5)    . . .   [37]    [38]
                                  3        (4)     (5)     (6)    . . .   [38]    [39]
                                  .
                                  33       (34)    (35)    (36)   . . .   [68]    [69]
                                  34       (35)    (36)    [37]   . . .   [69]    [70]
                                  35       (36)    [37]    [38]   . . .   [70]    [71]

    2nd Master                             (37)    (38)    (39)   . . .   (71)    (72)
    Second pack of detail cards   1        (38)    (39)    (40)   . . .   (72)    [73]
                                  2        (39)    (40)    (41)   . . .   [73]    [74]
                                  3        (40)    (41)    (42)   . . .   [74]    [75]
                                  .
                                  33       (70)    (71)    (72)   . . .   [104]   [105]
                                  34       (71)    (72)    [73]   . . .   [105]   [106]
                                  35       (72)    [73]    [74]   . . .   [106]   [107]

    3rd Master                             (73)    (74)    (75)   . . .   (107)   (108)
    Third pack of detail cards    1        (74)    (75)    (76)   . . .   (108)
                                  2        (75)    (76)    (77)   . . .
                                  3        (76)    (77)    (78)   . . .
                                  .
                                  33       (106)   (107)   (108)
                                  34       (107)   (108)
                                  35       (108)

Figures in round brackets are the ordinary figures of the first run; figures in square brackets
stand for the italic figures of the second run. The arrows are not reproduced: copy-punching
runs downward in the first run and upward in the second.

Schedule (vii)


The pack is then turned round bodily and passed through the reproducer a second time in 
the reverse order to complete the blank fields. The third pack of detail cards would pass without 
a master and, if desired, can be needled off and withdrawn from the run. 

The third master will precede the second pack of 35 detail cards and the second master the 
first pack. The effect of this run is shown by the italic figures; copy-punching proceeds in the 
direction of the upward arrows. 

Fields I and II will now contain observations one time interval apart; Fields I and III will
contain all observations two time intervals apart, and so on. The remainder of the work consists
in producing the sum of products of Field I into itself and into all other 35 fields by one of the
methods described in Part I (iv).
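The finished pack can be imitated by giving card t the observations (t), (t + 1), . . . in its 36 fields, and the sums of products of Field I into every field then give the lagged sums directly. In the sketch below blank fields are treated as zero and the series is an arbitrary set of 2-figure numbers; both are assumptions made for illustration.

```python
def lagged_cards(series, n_fields=36):
    """Card t holds observation t in Field I, t + 1 in Field II, and so on.

    Fields running past the end of the series are left blank (here: zero).
    """
    n = len(series)
    return [[series[t + k] if t + k < n else 0 for k in range(n_fields)]
            for t in range(n)]

def sums_of_products(cards):
    """Sum of products of Field I into itself and into every other field."""
    return [sum(card[0] * card[k] for card in cards) for k in range(len(cards[0]))]

# 108 arbitrary positive 2-figure observations (invented for illustration).
series = [(7 * t * t + 3 * t) % 90 + 10 for t in range(108)]

cards = lagged_cards(series)
lagged_sums = sums_of_products(cards)      # lag 0 (sum of squares) up to lag 35
print(lagged_sums[:4])
```

The serial correlation coefficients themselves still require the corresponding sums and sums of squares, which are obtained in the same tabulation.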

A machine (8) built specially for this type of work by assembling Post Office relays is said to
have a theoretical speed of 1 second per product.

Under conditions like those here postulated, the Hollerith method is very much faster. Also
the great merit is that these machines are available in larger numbers to statisticians, whereas
access to the special machine mentioned is confined to a small group of workers. Moreover it
can, by a mere change of plugging, be used for a variety of other calculations.

* These are the most favourable conditions for the Hollerith method about to be described, but the
necessary modifications for different requirements are obvious from the description of this special case.

(iii) Numerical quadrature on the National machine 
(a) The problem 

By numerical quadrature we mean the tabulation of the indefinite integral $\int_a^{x_0} f(x)\,dx$ of a given
integrand f(x) as a function of the upper limit $x_0$; geometrically the problem is to tabulate the
area between a given curve f(x) and the x-axis. The example chosen is one for which the answer
is known. In the graph below of $-\sin x$ the area between this curve, the x-axis and the verticals
at $x = -\pi/2$ and $x = x_0$ is equal to $\cos x_0$. The calculation is, therefore, one of a more fundamental
mathematical nature, and not one that is likely to occur frequently in the work of the practising
statistician. On the other hand, it is the fundamental operation in the tabulation of distribution
functions and their integrals in every statistical table.



The most important example is that of producing a table of the probability integral from a
given table of a distribution function. Moreover, most of the distribution functions in statistics 
are, by virtue of their definition, given in terms of integrations over a specified sample space. As 
is well known to mathematical statisticians, the occasions when these integrals can be found by 
analytical transformation and integration are the exception rather than the rule; more often 
than not an expression for the probability integral or distribution function of a statistic involves 
quadrature. The method to be described has already been applied to two large-scale tabulations 
of probability integrals (one of which has been published recently) — namely : 

(A) The distribution of the mean deviation in samples from a normal population (11). 

(B) The distribution of the average of eccentricity-readings (distribution of mean x for 
two degrees of freedom). 

In the case of (A) the extent of the tabulation undertaken would have been hardly practical 
without this method. 

The method is particularly suited to the evaluation of functions defined by quadrature-recur- 
rence, as it will produce in turn a printed table of each function required. See, for instance (11) 
and (12). It is also suitable for the numerical evaluation of integrals of two or more variables, 
which will be briefly referred to below. 

(b) The method 

The National accounting machine, which is used for this method, has been described in detail 
in this Journal (4), so no further description is given here. 

With numerical quadrature the integrand f(x) must be given numerically, usually in the form of a
table at equidistant tabular intervals w. More often than not this table must first be calculated
from a formula. In the present method we calculate a table of g(x) = (1/24) w f(x) rather than of f(x).
With every method of numerical quadrature the integrand should be checked first; the usual way
is by differencing. By a special arrangement on the National (shown below), we produce all the
differences of g(x) up to the fifth and also the double fourth difference $\Delta_0^{iv} + \Delta_1^{iv}$, which is required
later.

Differencing of g(x) = −(0.1/24) sin x (units of the eighth decimal; negative values appear as complements) *

   x    g(x) as set     Δ^v     −(Δ0^iv + Δ1^iv)     Δ^iv      Δ'''      Δ''        Δ'         g(x)

 −.2        82779        -             -               -         -        -          -          82779
 −.1        41597        -             -               -         -        -       9958818       41597
  .0            0        -             -               -         -     9999585    9958403           0
  .1        41597        -             -               -        415        0      9958403     9958403
  .2        82779        -             -               0        415       415     9958818     9917221
  .3       123133     9999998          2            9999998     413       828     9959646     9876867
  .4       162258     9999990         14            9999988     401      1229     9960875     9837742
  .5       199761           4         20            9999992     393      1622     9962497     9800239

* Two prints of Δ0^iv to the right of Δ^v are not given in this table, as these prints can in fact be
suppressed when using closely spaced non-print stops.
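
The differencing shown in the table can be reproduced with a short sketch. It is an illustration only, under stated assumptions: w = 0.1 and f(x) = −sin x as in the example, values held in units of the eighth decimal, backward differences formed from the most recent tabular values, and negative numbers shown as complements to 10,000,000. The function names are hypothetical and do not describe the registers of the National machine.

```python
import math

W = 0.1              # tabular interval w of the example
UNIT = 10**8         # values are held in units of the eighth decimal
MODULUS = 10**7      # assumed seven-figure print: negatives show as complements


def g(k):
    """g(x_k) = w f(x_k) / 24 with f(x) = -sin x, rounded to the eighth decimal."""
    return round(-(W / 24) * math.sin(k * W) * UNIT)


def backward_differences(k, order):
    """Return [g, first difference, ..., difference of the given order] ending at x_k."""
    column = [g(k - j) for j in range(order + 1)]      # newest value first
    result = [column[0]]
    for _ in range(order):
        column = [column[j] - column[j + 1] for j in range(len(column) - 1)]
        result.append(column[0])
    return result


def double_fourth(k):
    """Minus the sum of the two latest fourth differences, as printed in the table."""
    return -(backward_differences(k - 1, 4)[4] + backward_differences(k, 4)[4])


def as_printed(value):
    """Show a negative number as its complement, as in the table."""
    return value % MODULUS


for k in range(3, 6):
    row = [as_printed(v) for v in backward_differences(k, 5)]
    print(f"x = {k * W:.1f}", row, "double fourth:", double_fourth(k))
```

Each printed row lists g(x) and its first to fifth differences, followed by the double fourth difference, so it can be compared directly with the rows for x = .3, .4 and .5 above.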

The stops and schedule of operations for this process are shown below. (S.T. = Sub-total,
T. = Total)

Stop       Oper.         Print

0          Set x         x
+1         Set g1(x)     g1(x)
3          S.T. 1        Δ^v
−1         S.T. 3        Δ0^iv
−1         S.T. 3        Δ0^iv
0          T. 1          −(Δ0^iv + Δ1^iv)
−1 +4      S.T. 3        Δ^iv
−1 +2      S.T. 4        Δ'''
−1 +6      S.T. 2        Δ''
−1 +5      S.T. 6        Δ'
−1         S.T. 5        g(x)






The advantage of having the above finite difference table of g(x) rather than of f(x) is that
fractional coefficients in the quadrature formula can be avoided. We use the formula for the
increment in the integral I when increasing the upper limit from $x_0$ to $x_1$ (this increment is marked
by an arrow in the figure).

This is given in the standard text-books as

$$\Delta I = w\left\{\tfrac{1}{2}(f_0 + f_1) - \tfrac{1}{24}(\Delta_0'' + \Delta_1'') + \tfrac{11}{1440}(\Delta_0^{iv} + \Delta_1^{iv}) - \dots\right\}$$

in which $f_i = f(x_i)$, and the first term neglected is $-\tfrac{191}{120960}(\Delta_0^{vi} + \Delta_1^{vi})$, or approximately
$-\tfrac{1}{633}(\Delta_0^{vi} + \Delta_1^{vi})$. If we deliberately increase the coefficient $\tfrac{11}{1440}$ to $\tfrac{12}{1440}$, thereby neglect
$-\tfrac{1}{1440}(\Delta_0^{iv} + \Delta_1^{iv})$, and remember that $f = 24g/w$, we find that

$$\Delta I = 12(g_0 + g_1) - (\Delta_0'' + \Delta_1'') + 0.2(\Delta_0^{iv} + \Delta_1^{iv})$$

where the Δ's are now those of the g's. If we regard the integral as being made up of three
components, A, B and C, arising from the three terms in ΔI, we note that the fifth difference of A
(call it $\delta^{v}$) is $12(\Delta_0^{iv} + \Delta_1^{iv})$, the third difference of B (call it $\delta'''$) is $-(\Delta_0^{iv} + \Delta_1^{iv})$, while the first
difference of C (call it $\delta'$) is $0.2(\Delta_0^{iv} + \Delta_1^{iv})$. These three differences are all simple multiples of
the same double difference $\Delta_0^{iv} + \Delta_1^{iv}$.

By a special arrangement of stops (shown below) the sum I = A + B + C can be built up in
one finite-difference integration by feeding the input differences $\delta^{v}$, $\delta'''$ and $\delta'$ into the National
on the three first stops of each cycle of operations. $\delta''' = -(\Delta_0^{iv} + \Delta_1^{iv})$ is simply read from
the difference table of g(x), and its multiples (12 times for $\delta^{v}$ and 0.2 times for $\delta'$) are read off from
an auxiliary table of multiples of 12 and 0.2 with argument $(\Delta_0^{iv} + \Delta_1^{iv})$ ranging from −100 to
+100.

Usually, as in the example, the term C is negligible, and $\delta'$ need not be fed into the machine.
The quadrature shown below yields cos x to 5-decimal accuracy. The last three decimals are not
reliable (see rounding-off errors mentioned below), and a 5 in the sixth decimal has been added
for automatic rounding off; in the specimen the sixth, seventh and eighth decimals of the function
are not printed.







Quadrature of −sin x

     δ''' =               δ^v =                               12(g0 + g1)                         12(Δ0' + Δ1')
−(Δ0^iv + Δ1^iv)   12(Δ0^iv + Δ1^iv)   12(Δ0'' + Δ1'')    − (Δ0'' + Δ1'')   12(Δ0''' + Δ1''')   − (Δ0''' + Δ1''')     cos x      x

        0                  -                  -                  -                  -                  -            1.00000    0
        2              99999976             4980             99500421             9936             99005824         0.99500    0.1
       14              99999832            14916             98506245             9768             99020754         0.98007    0.2
       20              99999760            24684             97526999             9528             99045458         0.95534    0.3
       27              99999676            34212             96572457             9204             99079697         0.92106    0.4
       38              99999544            43416             95652154             8748             99123151         0.87758    0.5
       43              99999484            52164             94775305             8232             99175358         0.82534    0.6
       51              99999388            60396             93950663             7620             99235805         0.76484    0.7
       56              99999328            68016             93186468             6948             99303877         0.69671    0.8
       61              99999268            74964             92490345             6216             99378902         0.62161    0.9

The stops and schedules of operation for this process are shown below : 



Stop       Oper.

2          Set δ'''
4          Set δ^v
2          S.T. 1
6          S.T. 3
1          S.T. 4
3          S.T. 2
0          S.T. 6
0          Set Arg


It should be noted that none of the high-order differences is rounded off, and accumulation
of rounding-off errors arises only in the summation of the integrand values. This is inherent in
every process of numerical quadrature. Its maximum effect on the last figure of the summation
is ½n, where n is the number of steps in the integration.

At the end of the quadrature tabulation the contents of the register containing
$12(g_0 + g_1) - (\Delta_0'' + \Delta_1'')$ should agree exactly with the corresponding (check) value worked out from the
difference table of g. This checks the input of $\delta'''$ and $\delta^{v}$, and the operation. A similar check is
applied to the input of $\delta'$ when this is required. If the tabulation is lengthy, the check should
be applied at intermediate arguments.
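
The building-up of cos x from the increments 12(g0 + g1) − (Δ0'' + Δ1'') can be sketched as follows. This is a minimal illustration, not a description of the machine: it assumes w = 0.1, f(x) = −sin x, the starting value cos 0 = 1, central second differences of g, working in units of the eighth decimal, and the omission of the term C as in the example. All names are hypothetical.

```python
import math

W = 0.1                      # tabular interval w
UNIT = 10**8                 # work in units of the eighth decimal


def g(k):
    """g(x_k) = w f(x_k) / 24 for f(x) = -sin x, rounded to the eighth decimal."""
    return round(-(W / 24) * math.sin(k * W) * UNIT)


def d2(k):
    """Second central difference of g at x_k."""
    return g(k - 1) - 2 * g(k) + g(k + 1)


def cos_table(n_steps):
    """Sum the increments 12(g0 + g1) - (d2_0 + d2_1); the term C is omitted."""
    total = UNIT             # starting value cos 0 = 1
    rows = [(0, total)]
    for k in range(n_steps):
        total += 12 * (g(k) + g(k + 1)) - (d2(k) + d2(k + 1))
        rows.append((k + 1, total))
    return rows


for k, total in cos_table(9):
    five_dec = (total + 500) // 1000      # add 5 in the sixth decimal, drop three figures
    print(f"x = {k * W:.1f}   cos x = {five_dec / 10**5:.5f}")
```

The eight-figure totals carry the unreliable last three decimals mentioned in the text; only the five printed decimals are meant to be compared with cos x.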

When applying this method to functions of two or more variables we integrate for each
variable in turn. For instance, to evaluate $\iint f(x,y)\,dx\,dy$ we would first produce, for each y-value
in the x, y-grid, a table of the integral $\int f(x,y)\,dx$. For each $x_0$-value in the grid these integrals
would then be integrated over y. The original integrand function to be calculated would be
$\tfrac{1}{24}\cdot\tfrac{1}{24}\, w\,\omega f(x,y)$, where w and ω are the intervals of integration for x and y.
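
Integration for each variable in turn can be sketched with the same increment formula applied twice. The example below is an assumption-laden illustration: it uses floating-point values without rounding, the hypothetical test integrand cos(x + y), equal intervals of 0.1 in both variables, and function names invented for the purpose.

```python
import math


def increment(g, x0, w):
    """12(g0 + g1) - (d2_0 + d2_1) + 0.2(d4_0 + d4_1), where g = w f / 24."""
    def d2(x):
        return g(x - w) - 2 * g(x) + g(x + w)

    def d4(x):
        return d2(x - w) - 2 * d2(x) + d2(x + w)

    x1 = x0 + w
    return 12 * (g(x0) + g(x1)) - (d2(x0) + d2(x1)) + 0.2 * (d4(x0) + d4(x1))


def integrate(f, start, steps, w):
    """Integral of f from start to start + steps * w, built up step by step."""
    def g(x):
        return w * f(x) / 24

    return sum(increment(g, start + k * w, w) for k in range(steps))


def double_integral(f, steps_x, w, steps_y, omega):
    """Integrate over x for each y, then integrate those results over y."""
    def inner(y):
        return integrate(lambda x: f(x, y), 0.0, steps_x, w)

    return integrate(inner, 0.0, steps_y, omega)


approx = double_integral(lambda x, y: math.cos(x + y), 5, 0.1, 5, 0.1)
exact = 2 * math.cos(0.5) - math.cos(1.0) - 1.0   # closed form over the square of side 0.5
print(f"approx = {approx:.8f}, exact = {exact:.8f}")
```

The inner integral is evaluated afresh at every y required by the outer differences, which is wasteful but keeps the sketch short; a tabular arrangement would store these values once.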

With this method of quadrature, multiple integrals can be handled just as easily as multiple 
sums, and it is hoped, therefore, that it will help in the calculation of new distribution functions 
of small sample theory when evaluation by analytical methods is difficult and numerical evaluation 
has so far been shirked because of the amount of computational labour. 

(iv) The solution of simultaneous equations on Hollerith machines 

The method of solving simultaneous linear equations here presented is efficient only if the 
number of variables is greater than 20. If several systems of simultaneous equations have to be
solved at the same time, the method will deal efficiently with sets of equations with as few as five 
variables. 

Equations involving a large number of variables arise, for instance, in the adjustment of 
triangulation surveys, whilst in statistical work one is often faced with the task of solving a whole 
set of systems of linear equations with the number of variables in each system small or moderate.
Only broad principles of the method can be described here. 

The present Hollerith process is the straightforward successive elimination of variables. The 
principle is to transform the matrix of coefficients into the triangular form as shown by the first 
three of a set of fifty equations in the pattern of coefficients shown in Table VII. 

Table VII

Equation                                Unknown Number                                       S

                1             2             3             4       . . .       50

  (1)       2.361 581     −.653 584      .356 981    −1.854 386   . . .     .035 189     7.858 392
  (2)       0             3.581 735     −.035 786      .985 351   . . .   −1.386 584      .895 218
  (3)       0             0             2.238 516    −1.002 531   . . .     .316 584     5.318 780
  (4)      −.318 584       .567 814     1.003 516      .318 501   . . .     .001 384     2.386 009
  . . .
  (50)




In these three equations all coefficients to the left of the diagonal are 0. This form of the
equations is sometimes called the reduced form. [Once all equations have been transformed into
this reduced form the matrix is completely triangular, $x_{50}$ is given by the last equation, and the
remaining unknowns are calculated in turn ($x_{49}, \dots, x_1$) from the other equations in a process
known as back solution. This process constitutes only a small part of the work and is not dealt
with here.]
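
The reduction to triangular form and the back solution can be sketched on a small system. This is an illustration only: the three equations are hypothetical (not taken from Table VII), ordinary floating-point arithmetic replaces the punched-card process, and a non-zero diagonal term is assumed at every stage.

```python
def reduce_to_triangular(rows):
    """Successive elimination; each row is [coefficients..., right-hand side]."""
    rows = [list(map(float, row)) for row in rows]
    n = len(rows)
    for i in range(n):
        for j in range(i + 1, n):
            multiplier = -rows[j][i] / rows[i][i]   # assumes a non-zero diagonal term
            rows[j] = [a + multiplier * p for a, p in zip(rows[j], rows[i])]
            rows[j][i] = 0.0                        # exact zero left of the diagonal
    return rows


def back_solution(reduced):
    """Solve the triangular system, last unknown first."""
    n = len(reduced)
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(reduced[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (reduced[i][n] - known) / reduced[i][i]
    return x


# Hypothetical system with solution x = 2, y = 3, z = -1.
system = [
    [2, 1, -1, 8],
    [-3, -1, 2, -11],
    [-2, 1, 2, -3],
]
print(back_solution(reduce_to_triangular(system)))
```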

In order to transform equation (4) to this form, we must multiply equation (1) by the ratio
.318 584/2.361 581 = .134 903 and add it to equation (4), finding for the first two coefficients
0 and .479 644 respectively. We next multiply equation (2) by −.479 644/3.581 735 = −.133 914
and add it to the other two, and so on. It is obvious that the main arithmetical operation is the
multiplication of all the coefficients in a reduced equation by each of a large number of multipliers.
For instance, all coefficients in the reduced equation (1) must be multiplied by .134 903 to reduce
equation (4), and similarly by the 48 other multipliers required for reducing the other equations.
As soon, therefore, as a reduced equation is formed, we cater at once for all the multiples of its
coefficients required in all subsequent eliminations. This is done by producing “ready-reckoner” cards
giving the following “grocers-weight” system of multiples (m) of each coefficient:

10, 20, 40, 80          Also −160

1, 2, 4, 8

0.1, 0.2, 0.4, 0.8

. . .

0.000 001, . . ., 0.000 008

One Hollerith card will cater for five consecutive coefficients of the equation, e.g., the first card
for m = 2 in equation (1) will provide twice the first five coefficients of this equation, i.e., the
quantities

4.723 162, 999 98.692 832, .713 962, 999 96.291 228, 2(coefft. of $x_5$)

will be punched on this card. The multiples of the next set of five coefficients would be given on
a second set of ready-reckoner cards, and so on. Any multiple of the coefficients of equation (1)
can now be built up from these cards. For instance, to obtain .134 903 times the reduced
equation (1) we would have to select the ready-reckoner cards corresponding to m = .1, .01,
.02, .004, .0008, .0001, .000 001, and .000 002. The multiplier to be applied to the
second equation would be −.133 914. Such negative multipliers are throughout in this process
printed as complements to 100000 so that the above multiplier would be shown as 999 99.866
086. We would therefore select the multiples −160, 80, 40, 20, 10, 8, 1, .8, .04, .02, .004,
.002, .000 08, .000 004, .000 002. All cards for m = −160, 80, 40, 20, 10 would be picked
as one pack.
These cards are now added in the tabulator. Five counters are devoted to the reduction of
five consecutive coefficients of equation (4). Into each counter we enter first the original coefficient of
equation (4) (e.g., .318 501 is entered into the counter 4), then add to it the contributions from
the ready-reckoner cards of the first, second and third equations. The results of this tabulation are
the first five coefficients of the reduced equation. We may denote these by $a_{4,1}, a_{4,2}, \dots, a_{4,5}$
(actually $a_{4,1} = a_{4,2} = a_{4,3} = 0$). At this break of control the tabulator will immediately
summary-punch these five new coefficients (offset) thus making a card containing

$10a_{4,1},\ 10a_{4,2},\ \dots,\ 10a_{4,5}$

i.e., the (new) ready-reckoner card for m = 10. The tabulator will then roll the contents of each
counter into itself, thus doubling the contents of all counters, and then summary-punch
$20a_{4,1}, \dots, 20a_{4,5}$ on to a new card, thus making the ready-reckoner card for m = 20. Two more such cards
(those for m = 40 and m = 80) are made in the same way, and finally all counters are doubled a
fourth and last time to form $160a_{4,1}, \dots, 160a_{4,5}$. This time, however, we arrange to summary-
punch the complement of the counter contents (offset), thus producing the ready-reckoner card
for m = −160, after which the counter is cleared. The plugging for this operation is very
special, and certainly non-standard.


The tabulator will now immediately proceed to add the card with the next five coefficients of
equation (4) (those of $x_6, \dots, x_{10}$, say), followed by the appropriate ready-reckoner cards selected
from those of equations (1) to (3), and thereby produce the next five coefficients $a_{4,6}, \dots, a_{4,10}$,
and so on until $a_{4,46}, \dots, a_{4,50}$ are formed. This completes the reduction of equation (4) and
the punching of the ready-reckoner cards corresponding to m = 10, 20, 40, 80 and −160 for all
coefficients in this equation.

The sixth counter of the tabulator is reserved for checking purposes. It will add the indices 
m of the ready-reckoner cards in such a way that all multipliers are formed and printed, so that 
the operator will be able to see if any wrong ready-reckoner cards are in the pack. 

To produce the remaining ready-reckoner cards we use the simple principle of copy-punching
with digit shift in the reproducer punch (see Part 1 (iv) (a)). From each card m = 10 we copy-
punch the corresponding cards for m = 1, 0.1, . . ., 0.000 001 (and so on for 20, 40 and 80)
by one offset gang-punch run. This produces all the ready-reckoner cards for the reduced
equation (4).
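
The production of a full set of ready-reckoner cards by doubling, complementing and digit shift can be sketched as follows. The sketch is an assumption-based illustration: exact fractions stand in for the fixed-length card fields, so the loss of figures on a digit shift is not modelled, and the function name and the dictionary layout are hypothetical. The sample coefficients are the first four of equation (1) in Table VII.

```python
from fractions import Fraction


def ready_reckoner(coefficients):
    """Return {m: [m * coefficient, ...]} built only by doubling and digit shift."""
    cards = {}
    counters = [10 * c for c in coefficients]       # offset summary-punch: m = 10
    for m in (10, 20, 40, 80):
        cards[Fraction(m)] = list(counters)
        counters = [c + c for c in counters]        # roll each counter into itself
    cards[Fraction(-160)] = [-c for c in counters]  # complement of the last doubling
    for base in (10, 20, 40, 80):
        for shift in range(1, 8):                   # digit shift: m / 10, m / 100, ...
            m = Fraction(base, 10**shift)
            cards[m] = [c / 10**shift for c in cards[Fraction(base)]]
    return cards


coeffs = [Fraction(s) for s in ("2.361581", "-0.653584", "0.356981", "-1.854386")]
cards = ready_reckoner(coeffs)
print(len(cards), "cards;", "m = 2 gives", [float(v) for v in cards[Fraction(2)]])
```

Under these assumptions the set comprises 33 cards per group of coefficients: four multiples in each of eight decimal positions, and the −160 card.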

Nothing has been said about the formation of the multipliers to be applied to each reduced 
equation and the method of selecting the ready-reckoner cards required for each reduction. 
Indeed, in the above example the multiplier for equation (2) is not known until the contribution
from equation (1) has been added to the second coefficient. These and other difficulties have
been overcome in the case of a general matrix, but we must confine ourselves here to a special 
case. In most statistical applications we are dealing with a symmetrical matrix that has arisen 
in the course of a least squares solution. Here the process of elimination follows the Gauss- 
Doolittle method (see, e.g., (14)) with the advantage that the multipliers to be applied to each 
reduced normal for the reduction of the subsequent normals are known as soon as each reduced 
normal is formed. They are the coefficients of the reduced normal, divided by its diagonal term, 
and are produced on the tabulator (with virtually no change of plugging) by selecting the ready- 
reckoner cards corresponding to the reciprocal of the diagonal term formed on a calculating 
machine. 
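
The point that, for a symmetrical matrix, the multipliers are available as soon as each reduced normal is formed can be sketched as follows. This is a simplified illustration of elimination on normal equations and not a full account of the Gauss-Doolittle layout; the 3 × 3 system is hypothetical, exact fractions are used, and the function name is invented.

```python
from fractions import Fraction


def reduce_normals(normals):
    """Reduce symmetric normal equations; each row is [a_i1, ..., a_in, right-hand side]."""
    n = len(normals)
    rows = [[Fraction(v) for v in row] for row in normals]
    for i in range(n):
        reduced = rows[i]                          # the i-th reduced normal
        # By symmetry the multipliers are read from the reduced normal itself:
        # minus each of its later coefficients, divided by its diagonal term.
        multipliers = [-reduced[j] / reduced[i] for j in range(i + 1, n)]
        for j, m in zip(range(i + 1, n), multipliers):
            rows[j] = [a + m * p for a, p in zip(rows[j], reduced)]
    return rows


# Hypothetical symmetric system whose solution is x1 = x2 = x3 = 1.
normals = [
    [4, 2, 2, 8],
    [2, 5, 3, 10],
    [2, 3, 6, 11],
]
for row in reduce_normals(normals):
    print([str(v) for v in row])
```

No element below the diagonal is ever consulted to form a multiplier, which is the advantage the text attributes to the symmetrical case.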

All selection of required ready-reckoner cards is done by hand-picking from an ordered file 
with tabbed guide-cards, so that the cards for all coefficients in an equation are picked as small 
packs, and not singly. The refiling of all ready-reckoner cards after each reduction is done 
automatically on the sorter. It is convenient to work with two copies of the ready-reckoner 
cards, tabulating cards from the one whilst picking from the other the cards required for the next 
reduction. 

There is no essential limit to the number of variables that can be dealt with. The digital 
accuracy in the applications so far made is six decimals with two digits in front of the decimal 
point for all coefficients and multiples thereof. 

The process has been completely tested, but so far only in experiments. It promises to be 
about four times as fast as the elimination method carried out on desk calculators. At the time 
of writing, preparations are being made to solve an actual case of 28 normal equations for 28 
unknowns. 

For equations with a large number of unknowns (say 50 or more) the method of hand-picking 
may become laborious. A method in which the selection of the required ready-reckoner cards 
is completely automatic has been developed, but this is not economical when the number of 
unknowns is small. 

In conclusion I would like to acknowledge with gratitude the great improvements which
Dr. L. J. Comrie has made in the presentation of this paper, which, incidentally, resulted in a
20 per cent. cut of the text!

References 

(1) Baehne, G. W. (Editor), Practical Applications of the Punched Card Method in Colleges and Universities,
Columbia University Press, 1935.

(2) Brandt, A. E., Punched Card Method in Colleges and Universities, pp. 423-36, 1935.

(3) Comrie, L. J., The Hollerith and Powers Tabulating Machines, Printed for private circulation, 1935.
Now out of print.

(4) Comrie, L. J., J. Roy. Stat. Soc., Suppt., Vol. III, No. 2, 1936.

(5) Comrie, L. J., Hey, G. B. and Hudson, H. G., J. Roy. Stat. Soc., Suppt., Vol. IV, No. 2, 1937.

(6) Comrie, L. J., Calculating Machines. Sir Isaac Pitman and Sons, 1938.

(7) Comrie, L. J., and Hartley, H. O., Modern Machine Calculation with the Facit Calculating Machine,
Model LX. Scientific Computing Service, Ltd., 1939.

(8) Cunningham and Hynd, J. Roy. Stat. Soc., Suppt., pp. 62-85.

(9) Dwyer, Paul S., Journal of American Statistical Association, Vol. 27, pp. 279-86, 1932.

(10) Eckert, W. J., Punched Card Methods in Scientific Computation, The Thomas J. Watson Astronomical
Computing Bureau, Columbia University, 1940.

(11) Godwin, H. J., and Hartley, H. O., Biometrika, Vol. XXXIII, Part III, November 1945.

(12) Hartley, H. O., Biometrika, Vol. XXXIII, Part II, August 1944.

(13) McMullen, L., Biometrika, Vol. XXX, Parts III and IV, (1), January 1939.

(14) U.S. Coast and Geodetic Survey, Manual of Triangulation Computation and Adjustment, Special
Publication No. 138.


Discussion on Dr. Hartley’s Paper 

The Chairman. It gives me particular pleasure to move from the Chair that a vote of 
thanks be accorded to Dr. Hartley for his interesting and stimulating paper. Dr. Hartley first 
came to my notice in Cambridge as a pure mathematician, and I must take some responsibility
for directing his activities, firstly to mathematical statistics ; secondly to applied statistical work 
both at the Cambridge School of Agriculture and at the National Poultry Institute at Harper 
Adams Agricultural College; and thirdly to professional work on computation. In this Dr. 
Hartley was one more victim of the method I have practised in turning out statisticians. The 
method can be described very briefly. You catch a first-class mathematician, teach him the theory 
of statistics, then put him into a working laboratory which is performing an advisory function to 
experimental departments, where contact with the practical work not only stimulates the researches
of the mathematician by bringing him up against the theoretical problems for which a solution is 
needed, but also promotes methodological and experimental advances, which depend on develop- 
ment of theory. 

Dr. Hartley’s first connection with the more elaborate forms of computation came about through 
my connection with the Mathematical Tables Committee of the British Association, which he got 
out of a difficulty in 1938 by serving for a time in the guise of computer, mainly on the National 
6-Register machine. It is no accident that statistics and computation are associated. At the 
beginning of this century, when Professor Karl Pearson was doing his epoch-making work, 
numerical mathematics had not advanced beyond the limited stage needed by the ordinary 
mathematician and astronomer at that time, and it soon became evident that statistics required 
the evaluation of complex integrals by numerical methods, the construction of extensive 
mathematical tables, and the study of interpolation methods in one, two and more variables. 
Calculating machines, too, were in their infancy, and their use in the relatively routine calculations 
involved in summing squares and fitting frequency curves to grouped observational data did a 
great deal to popularize their use in scientific circles. Professor Karl Pearson used to tell the story 
of the mathematician, who shall be nameless, who came to him with an integral which he could not 
evaluate. Could Professor Pearson suggest a method? Professor Pearson’s reply, as he twiddled 
the handle of the Edwardian model of calculating machine on his desk, was that he had never met 
an integral that he could not evaluate. 

The development of the science, or should I say “ art,” of computation led in the ’thirties to the 
formation of Scientific Computing Service, Ltd., from which first Dr. Comrie came in 1936 and
now Dr. Hartley has come to communicate to this Section their contributions to the art; and to 
the establishment at Cambridge of a Mathematical Laboratory as a special Department of the 
Faculty of Mathematics. More recently a Mathematics Division of the National Physical 
Laboratory has been set up at Teddington with a special statistical section, and somewhat similar 
activities to those being carried out in Cambridge are in operation in Manchester and Liverpool. 

We are concerned to-day with computation applied to statistical calculations. These may 
range from the very simple to the complex, more perhaps because of the repetitive and arduous 
nature of the calculations than of their elaborate nature. Two extreme forms may be quoted from 
my own experience. In the ’twenties, when the new forms of experimental design in agriculture 
due to Professor R. A. Fisher were being tried out, I was roped in to go down to the Fens to help 
harvest some plots, in order that the statistician might be impressed with the amount of trouble he 
was causing the experimenter. It was a point of honour with me to produce the analysis of variance 
calculations, down to the final result, before the train reached Liverpool Street on the way home, 
and there was only the back of an envelope on which to work. The trick was to use the “dropped
constant” method and to keep the observations to two figures ranging between 40 and 60, using
the formula for (50 ± x)² to work out the squares. Square roots were not quite so easy, but there
was very little of this to do. At the other extreme we had before the war the “Greg bequest”
experiment carried out by Dr. Hudson and Mr. Hey to throw light on the experimental errors, 
including sampling errors, for different sizes of plots and sampling units. Helped by Dr. Comrie, 
this was quite a complex piece of work using Hollerith machines, which turned my laboratory into 
a miniature factory for the time being. 
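
The “dropped constant” device mentioned above can be sketched in a few lines. The plot values are hypothetical, the working constant of 50 follows the Chairman's description, and the function names are invented for illustration.

```python
from fractions import Fraction

WORKING_CONSTANT = 50   # the "dropped constant"


def square_via_50(x):
    """(50 + d)^2 = 2500 + 100 d + d^2, where d = x - 50 may be negative."""
    d = x - WORKING_CONSTANT
    return 2500 + 100 * d + d * d


def sum_of_squares_about_mean(observations):
    """Corrected sum of squares computed from the small deviations d = x - 50."""
    d = [x - WORKING_CONSTANT for x in observations]
    return sum(v * v for v in d) - Fraction(sum(d) ** 2, len(d))


plots = [47, 52, 58, 41, 55, 49]          # hypothetical two-figure observations
print(square_via_50(47), sum_of_squares_about_mean(plots))
```

Dropping the constant leaves the corrected sum of squares unchanged, so only the squares of numbers no larger than 10 are needed.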

There are a number of points in Dr. Hartley’s paper which should stimulate discussion. I
hope I shall not be thought ungrateful, where so much has been given, if I suggest that more can
be said about analysis of variance in its computational aspects than appears in the paper, where
it is implied that because it is only a case of summing squares, it is a special case of the methods
developed for dealing with multi-variate regression. The applications of the analysis of variance
technique are to-day fairly numerous, and rightly so, and it would be helpful to put on record the
experience which has been gained in systematizing the computations.

Mr. E. C. Fieller: I have much pleasure in seconding this vote of thanks, and I should like 
to congratulate Dr. Hartley on a paper that has the clarity that we have all come to expect from 
the Scientific Computing Service. Later speakers from the National Physical Laboratory can 
discuss the Hollerith and National Machines with much more intimate knowledge than I possess, 
and therefore I do not propose to talk at length. I have, however, one query, and one quarrel.

My query concerns the final section of Dr. Hartley’s paper : what does he advise us to do, if 
the interval at which we have to tabulate an integral is so wide, that the fourth differences of the 
integrand are not negligible? Possibly the practical answer is, that such a situation rarely arises ;
a reasonable table ought to be interpolable by means of a second-difference formula, and if third 
differences of the integral are negligible, so are fourth differences of the integrand. This is not a 
complete answer, of course ; a table in which third differences are not negligible can still be inter- 
polated by means of Everett’s second-order difference formula. 

My quarrel is with one of Dr. Hartley’s opening remarks : “ many statisticians loathe figures.” 
A statistician, surely, is somebody who makes sense of numbers, and I do not see how he can do
that if he hates the sight of the things. I do not deny that there is a very useful class of people
who dislike arithmetic themselves, but justify their existence by telling other people how to do it. 
To my mind, however, they are more appropriately described as statistical mathematicians ; the 
statistician himself need not be good at arithmetic, since he can call on a multiplicity of aids of the 
type that Dr. Hartley has described and developed, but to do good thinking he must be interested 
in his figures, interested in the subjects they refer to, and interested in the results that his calculations 
produce. 

The vote of thanks was put to the meeting and carried unanimously. 

Mr. Mandeville, after adding his personal thanks for the paper and congratulating Dr. 
Hartley upon the ingenuity shown in the application of a Hollerith Tabulator to the solution of 
simultaneous equations, said that he felt that most of those who designed unusual applications of 
Hollerith were in fact standing on the shoulders of Dr. Comrie. Fifteen years ago he learned 
nearly all he knew of the Hollerith system from Dr. Comrie, and from his encouragement to consider 
all problems from first principles he had ever since been grateful. Hollerith machines were 
basically very simple machines, and the first principles of application were therefore simple. The 
machines were operated largely by a series of circuits similar to those used in the ordinary electric 
bell, and in fact they were often less complicated than the house bell. Mr. Mandeville showed on 
the lantern screen a diagram of the ordinary bell as compared with the circuit of the Hollerith 
distributor, pointing out that the distributor was the less complicated of the two. The great 
thing, he insisted, was to refuse to consider a mass of interwoven plugs and just to trace out one 
plug or circuit at a time. 

He then showed a diagram illustrating how by the consideration of first principles a Hollerith 
tabulator could be used to multiply totals, and so weight results derived from additions of 
quantities recorded on cards. In the case of the National Farm Survey, in which stratified samples 
were taken from the records of various classes of farms, the theoretical proportions of the samples 
were modified for the sake of using the tabulator more effectively. Of A class farms 5 per cent.
were taken instead of, say, 7 per cent., of B class 10 per cent., of C class 25 per cent., of D class
50 per cent., and of E class 100 per cent. The Hollerith tabulator could then add the acreages and
other quantities relating to any class of farms, and produce totals weighted to give the 100 per cent.
estimate for that class. This was done by using the distributor of the Hollerith tabulator to alter 
the position in which a quantity was added into a counter one digit to the left. This multiplied the 
total by 10. Thereafter advantage was taken of the capacity of the tabulator to transfer totals, 
and the fact that when totals are transferred from a counter back into the same counter the total 
in the counter is doubled, to double or quadruple the totals in the counters. Thus the totals for the 
A class farms were entered one place to the left in the counter by means of the distributor, and 
thereafter doubled by transferring them into themselves, which multiplied the 5 per cent, sample 
by 20 and gave the 100 per cent. The other classes were dealt with using one or both of these 
features of the tabulator to give weighted totals. 
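
The weighting described by Mr. Mandeville can be sketched as a combination of one digit shift and repeated doubling. Only the treatment of class A is spelled out in the discussion; the steps assumed here for classes B to E are inferences consistent with the stated sampling percentages, and the names are hypothetical.

```python
# Sampling percentage and the assumed tabulator steps for each class of farm:
# (percentage sampled, shift one place to the left, number of doublings).
SCHEME = {
    "A": (5, True, 1),      # x10 by the distributor, then doubled: x20
    "B": (10, True, 0),     # x10
    "C": (25, False, 2),    # doubled twice: x4
    "D": (50, False, 1),    # doubled once: x2
    "E": (100, False, 0),   # complete count, no weighting
}


def weighted_total(sample_total, shift_left, doublings):
    """Weight a counter total using only a digit shift and transfers into itself."""
    total = sample_total * 10 if shift_left else sample_total
    for _ in range(doublings):
        total += total          # transferring a counter into itself doubles it
    return total


for farm_class, (percent, shift_left, doublings) in SCHEME.items():
    factor = weighted_total(1, shift_left, doublings)
    print(f"class {farm_class}: {percent} per cent. sample, weighting factor {factor}")
```

Every sampling percentage chosen has a reciprocal that is a product of 10 and powers of 2, which is why the sample of 5 per cent. was preferred to one of 7 per cent.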

This was a good illustration of the statistician modifying the design of a piece of work to suit 
the machine, and thus making unnecessary hundreds of hours of hand calculation. It was rendered
possible by a clear understanding of the first principles of the machine to be used.

Mr. J. Todd was sure they were all greatly indebted to Dr. Hartley for revealing the latest 
ingenious devices of the computer in this field. Mr. Todd’s interest in these matters began only
during the war, and the organization with which he was connected had only a small amount of
statistical work. His experience was, therefore, very limited, particularly as it was his pleasure to
give problems in computing and statistics as soon as possible into the capable hands of his colleagues,
Mr. Sadler and Dr. Vajda. There were two points he would like to make, but before doing so he
wished to put some comments on behalf of Mr. Sadler, who was unable to attend the meeting.

Mr. Todd then read a summary of the following contribution from Mr. D. H. Sadler.

“ I wish to congratulate Dr. Hartley on the ingenuity which he exhibits in adapting calculating 
machines for scientific computation. Three methods in particular promise to be of considerably 
wider application than his immediate illustrations — namely, off-set gang-punching, the use of the 
National machine for quadratures, and solution of simultaneous equations by Hollerith machines. 

“It seems difficult to understand, however, why Dr. Hartley should have used the quadrature 
formula based on function values at the tabular points for which the integral is required: the 
similar formula based on function values at half-way points is much more convergent and offers a 
more powerful method, which in fact illustrates his “ tricks ” more effectively. 

“ The formula is (in the notation of the paper):

$$\Delta I = w\left\{ f_{1/2} + \tfrac{1}{24}\Delta''_{1/2} - \tfrac{17}{5760}\Delta^{iv}_{1/2} + \dots \right\}$$

in which the first neglected term is about + [. . .] compared with [. . .]. If now the coefficient
$-\tfrac{17}{5760}$ is approximated to by $-\tfrac{16.8}{5760}$ (i.e., making an error of + [. . .] as compared
with [. . .] in Hartley’s formula), the integral can be built up from the following multiples of
some high-order difference of g(x) = (1/24) w f(x):

+24; +1; −0.07

“ The principal objection to the use of the “ half-way ” formula is that it may require the special 
calculation of the function values at the half-way points ; but Dr. Hartley’s method implies that 
g(x) will be specially calculated and, in general, there is no more difficulty in calculating for one
value of x rather than another. The advantages are that no special difference “set-up” on the
National machine is required, integration can be made from any convenient difference and the
calculation of initial and check values is simplified. Further, the accuracy is greater and the last 
term can more frequently be neglected. The only disadvantage is that an extra fictitious figure has 
to be used in the integrations. 

“ It is interesting to compare the errors of the two alternative methods: (i) Dr. Hartley’s at
interval w and (ii) the “half-way” method at interval 2w. If E represents a rounding-off error and
N decimals are retained in g(x) (which of course will be (1/12) w f(x) in case (ii)), the errors in the
increment of the integral in the two methods are:

(i) 24 £.10-^ - H- - • • • 

(ii) 24 £.10-^ + ^ + . . • 

where $f^{iv}$ and $f^{vi}$ are the fourth and sixth derivatives and the coefficients of $f^{vi}$ are approximate.

“ Provided the derivatives are of the same order and except for very small number of figures 
with a correspondingly large interval, the critical term of the error of (i) is the second and not the 
third. It will thus be seen that, in general, method (ii) can be used at double the interval of method
(i), with the added advantage that there will be only half as many intervals and half as many errors 
to accumulate.” 
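
Mr. Sadler's comparison can be tried numerically. The sketch below is an illustration under assumptions: it takes the “half-way” formula to be the standard central-difference one whose multiples of g and its differences are +24, +1 and −0.07, uses floating-point values without rounding, and applies both methods to f(x) = −sin x between x = 1.0 and x = 1.2 (two steps of 0.1 for method (i), one step of 0.2 for method (ii)). The function names are hypothetical.

```python
import math


def central_differences(g, h):
    """Return functions giving the second and fourth central differences of g at spacing h."""
    def d2(x):
        return g(x - h) - 2 * g(x) + g(x + h)

    def d4(x):
        return d2(x - h) - 2 * d2(x) + d2(x + h)

    return d2, d4


def hartley_increment(f, x0, h):
    """Method (i): 12(g0 + g1) - (d2_0 + d2_1) + 0.2(d4_0 + d4_1), with g = h f / 24."""
    g = lambda x: h * f(x) / 24
    d2, d4 = central_differences(g, h)
    x1 = x0 + h
    return 12 * (g(x0) + g(x1)) - (d2(x0) + d2(x1)) + 0.2 * (d4(x0) + d4(x1))


def halfway_increment(f, x0, h):
    """Method (ii): 24 g + d2 - 0.07 d4, all taken at the half-way point."""
    g = lambda x: h * f(x) / 24
    d2, d4 = central_differences(g, h)
    mid = x0 + h / 2
    return 24 * g(mid) + d2(mid) - 0.07 * d4(mid)


f = lambda x: -math.sin(x)
exact = math.cos(1.2) - math.cos(1.0)
method_i = hartley_increment(f, 1.0, 0.1) + hartley_increment(f, 1.1, 0.1)
method_ii = halfway_increment(f, 1.0, 0.2)
print(f"error (i):  {method_i - exact:.2e}")
print(f"error (ii): {method_ii - exact:.2e}")
```

Rounding-off errors, which form the first term of each error expression in the contribution, are deliberately left out of this comparison.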

Mr. Todd’s first point concerned the problem of the solution of simultaneous linear equations, 
and indeed matrix arithmetic in general. He was not convinced that the capabilities of punched- 
card equipment here had been fully exploited. There had been nothing here so striking as Dr. 
Comrie’s discovery of the possibilities of the National. It seemed reasonable to ask a mathematician 
to devise some new methods of solving problems in this field which were particularly suitable for 
putting on machines. The idea of solving a problem under assigned restrictions was quite an 
acceptable one, and the difficulty lay in formulating those restrictions. 

Admiralty Computing Service (A.C.S.) began an investigation of methods of determining 
characteristic values for systems with a large number of degrees of freedom which were suitable for 
use with ordinary calculating machines. Dr. Aronszajn had provided them with some new and
very powerful methods, which Dr. Fox had tried out and found remarkably efficient. Unfortunately,
Dr. Aronszajn returned to Paris before he had written up this work, and so far had
not sent a full account. Dr. Fox could supply some more information. Since A.C.S. would 
become more mathematical in future, they had passed their information to the Oscillation Sub- 
committee of the Aeronautical Research Council, who had in turn passed it to the Mathematics 
Division of the National Physical Laboratory for further development. 

Secondly, there was a problem which confronted A.C.S. during the war and which might occur
again in the case of Industrial Establishments, and on which Dr. Hartley’s views would be valuable.
Briefly, it was this: when does one install a punched-card equipment and how much? (Actually,
they decided not to, and borrowed machines when they needed them.)

More precisely the decision was between ordinary machines (including Nationals), punched- 
card equipment (complete or incomplete) and electronic machines. The advantages and dis- 
advantages of the first two were well known ; of the third they had little information, and perhaps 
only faith. 

The problem was to measure efficiency or economy in the large — was the measure “ time ”? 
To take some examples. Suppose they had decided to have punched-card equipment : should they 
dispense with a collator and do more sorting, or reproduce their pack and avoid back sorting, or 
should they dispense with a multiplying punch and use a tabulator, as suggested by Dr. Hartley ? 

Could these problems be formulated more precisely, and was there a reasonably quantitative 
answer? Or should they just call in Dr. Hartley to advise them?

Dr. H. G. Hudson wished first to express his appreciation at being allowed to be present to 
hear Dr. Hartley’s most interesting paper and to take part in the discussion. There were a number 
of visitors there and he was sure they would wish him to join their thanks with his. 

The Chairman had mentioned the value of commercial calculating machines in analysis of 
variance work. In interpreting the results of field trials in Agriculture this method of analysis 
was often used, but the number of plots, and thus the number of figures to be analysed, rarely 
exceeded 80, and was frequently between 20 and 30. Modern methods of experimentation,
particularly those developed by Dr. Yates and called “confounded,” meant that the plot totals
had to be collected into sub-aggregates, and it became necessary to obtain the sums of squares
relating to each sub-aggregate. The numbers of figures in each sub-aggregate were small (often
only 2), but there might well be a considerable number of them. The computation then took
rather a different aspect: that of many small calculations of varied nature rather than a few larger
ones. He would not like Dr. Hartley to leave them with the impression that commercial 
calculating machines were of no value in this field ; they were, in fact, indispensable ; but rather 
different qualities were desirable, and unless a very large number of similar analyses were to be 
done, Hollerith machines were not as useful as desk calculators. 

During the war we were told that our fighter aircraft were pre-eminent not only because of their 
high maximum speed, but rather because of that combination of great speed allied to great 
manoeuvrability which they possessed. Similarly, for analysis of variance work a machine must
not be judged only by its “theoretical speed,” but also by its flexibility and adaptability. A
machine must therefore be judged solely in the light of the task which it was to be asked to perform, 
and it did not necessarily follow that the most expensive machine (or the machine with the most 
gadgets) would be the most useful for a given job. For example, there were certain highly developed 
machines on which it was difficult to calculate square roots with speed. Indeed, the speaker 
sometimes found himself wondering if the flexibility of the hand machine (such as the Brunsviga, 
especially that model on which it is possible to transfer numbers from the Product Register to the 
Setting Levers) did not make it preferable to some of the highly developed modern electrically
operated machines for this work. 

The second essential of this type of computing was careful planning. By using suitable methods 
on the machine (notably those outlined by Dr. Hartley in section B (1)), and clearing at suitable 
intervals, it was possible to form sub-aggregates and sums of squares in the same operation, and by 
assuming their correctness temporarily, methods could be devised whereby they were checked in 
the process of later calculations. With a well-thought-out and planned sequence of operations it 
was often possible to do all or most of the checking without any purely repetitive operations, i.e.,
by what might be called planned checking rather than repetitive checking. 
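
The idea of forming sub-aggregates and sums of squares in the same operation can be sketched in modern terms as follows. This is only an illustration under stated assumptions: the function names and the plot labels are hypothetical, and the particular check shown (the sub-aggregates must add up to a grand total and grand sum of squares needed later in the analysis in any case) is one possible reading of “planned checking,” not the speaker's own scheme.

```python
from collections import defaultdict

def sub_aggregates(plots):
    """One pass over (label, value) pairs: count, total and sum of squares per label."""
    stats = defaultdict(lambda: [0, 0, 0])    # [count, total, sum of squares]
    for label, value in plots:
        entry = stats[label]
        entry[0] += 1
        entry[1] += value
        entry[2] += value * value
    return dict(stats)

def planned_check(stats, grand_total, grand_sum_of_squares):
    """Compare the sub-aggregates with grand figures obtained at a later stage."""
    return (sum(e[1] for e in stats.values()) == grand_total
            and sum(e[2] for e in stats.values()) == grand_sum_of_squares)

# Hypothetical plot yields, two plots in each sub-aggregate.
plots = [("A", 12), ("B", 15), ("A", 14), ("B", 11)]
stats = sub_aggregates(plots)
ok = planned_check(stats, grand_total=52, grand_sum_of_squares=686)
```

The check costs nothing extra when the grand total and grand sum of squares are required for the analysis of variance itself.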

To Hollerith he had come as one who had had some familiarity at a time in the past. The 
mention of Collators, and even of Summary Punches, had brought home to him just how long ago
that was, and what progress has been made more recently. He would like to close by asking a 
simple question for information. There was at one time talk of adapting the Rolling Total Machine 
to the analysis of variance and co-variance. Had such a practice been devised; if so, was it 
economical, and if so, how widely was it used? 

Dr. L. J. Comrie read the following contribution from Mr. G. B. Hey, who was unable to
attend the meeting : 

“I very much regret being unable to be present to-day to hear Dr. Hartley’s most interesting
paper. Having worked with the author for many years, it is a pleasure to see the publication of 
even a small sample from the many new schemes that have been evolved for handling statistical 
and other computations. 

“I must disagree with Dr. Hartley’s way, on p. 164, of providing for gaps in a sequence. When
N is about 30 we would only use a Hollerith if there were a number of groups to be done at the
same time, and with the XY card method it is essential to put each group through the sorter and
tabulator separately. This is very wasteful of time, and renders the checking rather more
complicated. A simpler way, which works ideally, is to include a dummy set of cards punched
8, 7, 6 . . . 1, 0 in any necessary columns, making one such set for each group. This scheme
usually adds the same total to every sum of products, but this is easily removed, and the entire 
process is quite automatic. 

“On p. 166, where moving averages (but not totals) are required, I would suggest setting the data
out in columns of 6 lines each, recording the total of each column. Now set one-sixth on a
calculating machine, multiply it by the first total and record the product. Without clearing any
register multiply by the difference, formed mentally, between the first values in columns 1 and 2,
and record, and so on for the second to sixth pairs. The multiplier register should now contain
the total of the second column. Continue the process on the second and third columns, ending
with a repeat of the first column. The sum of the recorded results should equal the total of the
original values except for rounding off errors. Dummy values are supplied to bring the original
series up to an exact multiple of 6.
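
The procedure just described can be followed step by step in a short sketch. It is an illustration only: the function name is hypothetical, the series is taken to wrap round to its first column as the text suggests, and exact fractions stand in for the machine so that the final check holds without rounding-off errors.

```python
from fractions import Fraction

def cyclic_moving_averages(values, k=6):
    """Moving averages of k items by the running update described in the text.

    The series is set out in columns of k lines; the 'multiplier register'
    starts at the first column total and is altered by the difference between
    corresponding lines of successive columns. The last column is followed by
    a repeat of the first.
    """
    if len(values) % k != 0:
        raise ValueError("supply dummy values to bring the series to a multiple of k")
    columns = [values[i:i + k] for i in range(0, len(values), k)]
    column_totals = [sum(col) for col in columns]
    register = column_totals[0]
    averages = []
    for c, current in enumerate(columns):
        following = columns[(c + 1) % len(columns)]
        # Check: on reaching each column the register holds that column's total.
        assert register == column_totals[c]
        for j in range(k):
            averages.append(Fraction(register) / k)    # record one-kth of the register
            register += following[j] - current[j]      # difference "formed mentally"
    assert register == column_totals[0]                # back at the first column
    return averages

values = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
averages = cyclic_moving_averages(values)
```

Because every value enters exactly six of the recorded averages, their sum equals the total of the original values, which is the check proposed.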

“The effect described on p. 167 could in practice be better obtained by punching 108 cards in
field XXXVI only and gang-punching once. Dr. Hartley’s scheme seems unnecessarily complicated 
both in theory and practice. 

“I am intrigued by Dr. Hartley’s scheme on p. 171 for getting 10, 20, 40, 80 and 160 times
certain quantities punched automatically. This appears to require some internal re-wiring of the 
tabulator, and it would be interesting to hear how the effect is obtained. 

“I would like to make two general observations. The first concerns checking, a matter that
Dr. Hartley has mentioned on occasions, but which I feel he has not sufficiently stressed. With the 
processes he has described it is often much more difficult to devise a foolproof checking system 
than to plan the actual work. Statisticians and physicists, and even computers, seem to have an 
entirely unwarranted idea of their ability to do, or direct, a set of computations without error, and 
the whole subject of checking is worthy of a paper to itself. 

“Secondly, I would like to protest about the difficulties in the way of those who desire to make
use of the experience and machinery that exist. Computing technique in this country appears in 
many respects to be far more advanced than in America, but owing to the cost, and the difficulty 
of access, particularly to Hollerith machines, the use made of such facilities is much less here than 
in the States. Moreover, where Hollerith is fairly freely available, as in some Government Departments,
I know from personal experience that it is often handled in a most inefficient manner, the
excessive cost to the taxpayer being hidden by the mysteries of interdepartmental accounting. 

“The methods Dr. Hartley has shown are of great power, and ought to be generally available
to the scientific world at a price it can afford. The dangers of too much power are, however,
serious, and the author’s opening quotations on the danger of forgetting fundamentals are very
appropriate. For instance, a ‘statistic’ was devised to summarize the meaning of a collection of
data, yet with a Hollerith it has happened that the number of ‘statistics’ produced has exceeded
the number of original data. Such is the nature of progress.

“In conclusion I would thank Dr. Hartley for his most interesting and stimulating paper.”

Dr. Comrie then addressed a few words to the Section on his own account. He was glad
that no one had got up that evening to talk about making special machines, because, before they 
made special machines to do these things, it was well to explore the possibilities of existing 
machines. The work about which they had been told was carried out with ordinary commercial 
machines, not particularly meant for statisticians, although suitable for their work. The Hollerith 
and the other machines that had been mentioned could perform, with the right technique, practically 
all that was wanted ; if a person said that he was designing a machine he ought to be asked whether 
he had thoroughly explored those already existing. 

Mr. Boss desired to thank Dr. Hartley for writing up in a convenient form several techniques 
which were not very well known. He wished to put it on record that Dr. Hartley had carried out 
his quite considerable work on punch-cards in the face of great difficulties, with machines which 
had been borrowed, and often in inaccessible places. The fact that these matters were not well 
known led him to another point, that very little work had been done, by very few people, in this
country on such methods at all, subsequent to Dr. Comrie’s pioneer work in the early ’thirties. 
Since the Hollerith machines were improved substantially in the middle ’thirties, very little use had 
been made of them, relatively speaking, for mathematical and statistical work. At the outbreak 
of war there was no modern installation working full time on mathematical or statistical work (he
meant the type of calculation that Dr. Hartley mentioned in his paper). During the war two
installations had been set up, the second of which, at the National Physical Laboratory, was less 
than a year old. The explanation of why some of the questions had been asked that evening, and
why the methods were evidently not known, might lie in that fact. 

He wished to make three general comments on the paper. First of all, he thought that Dr.
Hartley had understated the value of the digital multiplication which he had described. There
were certain cautions, however, which he would like to mention in a moment. A point which Dr. 
Hartley did not make, and which should be stressed, was that punch-card techniques were very 
different indeed from other methods, and to attempt to argue from the one to the other was most 
dangerous. Furthermore, there was a fundamental difference between commercial technique in
punch-card work and mathematical technique. As regards details, he did not think it was Dr. 
Hartley’s intention that his remarks on digital accuracy should be applied to punch-card work. 
It helped enormously to work with small digits, but that did not necessarily apply to punch-cards, 
because one very soon got into negative quantities. Dr. Hartley had not stressed the fact that most 
methods talked about in the early part of the paper were concerned primarily with positive integers. 

The method of summary multiplication was very simple, but he thought that certain points 
should be mentioned, including particularly the description given of the tabulator. The standard 
tabulator employed for commercial work had six counters of eleven wheels each. Unfortunately, 
from the statistical angle, the majority of commercial machines had to handle pounds, shillings and 
pence, and therefore the maximum decimal capacity of a counter was sometimes only six wheels
and in certain cases eight. That did not detract from the value of the method : it only made it a 
little slower. 

He had only one comment to make about the Hey tabulator. It was not, unfortunately, as 
young as it used to be, and it gave a certain amount of trouble. It had certain attractive features, 
but the method of linking counters again applied only to positive quantities. 

He was sorry that Dr. Hartley did not round off his picture by saying a little more about the 
Hollerith (multiplier). The machine was certainly slow, but it was a machine which got through 
an immense amount of work in a quiet way. Furthermore, it was the only machine which would 
handle multiplication by many digits satisfactorily, in the sense that it was an 8 x 8 machine. It 
was not generally known that by sacrificing one digit, a Hollerith multiplier would also deal with 
the question of signs, and by the addition of a single relay would deal with all 8 x 8 multiplications 
and take account of the signs. 


Mr. H. L. Seal, after thanking Dr. Hartley for his paper, said that he echoed Dr. Fieller in 
disagreeing with the opening statement that statisticians are not necessarily good at arithmetic. 
In his opinion, unless a man had worn out at least one calculating machine he was not a true 
statistician! He supposed that when Dr. Hartley described the Hollerith method of “summary
multiplication,” everyone realized that in fact the summation method of calculating an arithmetic
average was being applied. It was of historical interest to mention that this was the procedure
first utilized by Tetens, a Danish actuarial mathematician, in 1785, when calculating the value of an
annuity whose successive payments were the cardinal numbers 1, 2, 3, . . . This was merely
another illustration of the fact that throughout history efficient computers had again and again
rediscovered useful dodges to help them in their work.
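
The summation method referred to here rests on the fact that a sum of products with small whole-number multipliers can be obtained by addition alone: the values are accumulated from the highest multiplier downwards, and the running total is itself accumulated once at each level. The sketch below is a generic illustration of that identity; the function name and the sample cards are hypothetical, and it is not claimed to reproduce the plugging used in Dr. Hartley's paper.

```python
def sum_of_products_by_summation(cards):
    """Sum of multiplier * value using additions only.

    cards: list of (multiplier, value) pairs, the multipliers being small
    non-negative whole numbers (for instance a single digit).
    """
    top = max(m for m, _ in cards)
    running = 0        # total of all values whose multiplier is at least `level`
    progressive = 0    # sum of the successive running totals
    for level in range(top, 0, -1):
        running += sum(v for m, v in cards if m == level)
        progressive += running
    return progressive

cards = [(3, 10), (1, 4), (2, 7), (3, 1), (0, 9)]
by_summation = sum_of_products_by_summation(cards)
direct = sum(m * v for m, v in cards)
```

A value with multiplier 3 is included in three of the running totals, one with multiplier 2 in two of them, and so on, so the progressive total equals the sum of products.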

It might be mentioned that the method of quadrature devised for use with a National machine 
seemed to be suitable, with possibly slight modifications, for a Hollerith Rolling Total Tabulator 
in much the same way as the National differencing method (described by Comrie in the Supplement) 
could be modified for use with a Hollerith Tabulator. 

In his reference to numerical quadrature the author had mentioned his application of the method 
described to the calculation of tables of the probability integral of the mean deviation of normal 
samples of n. The highly complicated formula used was derived by Godwin in Biometrika, 
XXXIII, 1945, 254, but the speaker wondered whether the work of the Italian actuary Tricomi 
would not have been preferable from the computational standpoint. Briefly, Tricomi had 
shown (Giornale dell’Istituto Italiano degli Attuari, VII, 1936, 280, and VIII, 1937, 68, 127) that (in
Godwin’s notation) 


/«+i(/n) = (/! + 1) v; 


.m(l + 1/m) _ ^ 


and that the characteristic function of m for a sample of n is given by ^„(— it) where 

0 

Erdélyi (l.c., VIII, 1937, 328) pointed out that $f_n(m)$ could be obtained from ^Jit) by means of
Doetsch's real inversion formula 


U0E(mUk)dt (2) 

i A

where 




and is a function whose numerical tabulation would prove of considerable value to statisticians. 
It would be interesting to ask Dr. Hartley to estimate by how much his calculations of the tables of 
$f_n(m)$ would have been shortened by the use of (1) or (2).


Dr. E. T. Goodwin said that the work which he had been organizing and carrying out had 
included few statistical calculations, and he could not add much of value to the discussion. He 
wished only to ask two questions. With regard to the application of the National machine 
mentioned on p. 168, could the author give some idea of the speed of this finite difference integration 
process ? Its time-saving nature was most apparent when the integral was required for the tabular 
points at which the integrand was tabulated. When the integral was required for only a few tabular 
points it might be preferable to employ some alternative method. He was thinking in particular 
of the method whereby the first sum of the function is formed and simultaneously differenced on 
the National machine, the values of the integral then being given by the usual central difference 
formula. 

He also desired to make a comment which Dr. Hartley might well regard as Joadish. On 
p. 158 he spoke of multiplication on an electric machine when the motor was controlled either by 
the operator through a motor-bar or by relays, and he called the first of these controls semi-automatic
multiplication and the second fully-automatic multiplication. The former seemed to
be, in essence, no more automatic than multiplication on a hand machine of the Brunsviga type,
and he would prefer to reserve the term “semi-automatic” for the Marchant type of machine,
where one strikes in succession the keys of a ten-key keyboard, as distinct from “fully-automatic
multiplication,” when multiplier and multiplicand are both set on keyboards and the operation of 
multiplication induced by striking a single key. 


The following contributions were received in writing after the meeting : 

Mr. J. L. Ineson : Dr. Hartley in Part II of his paper has given a selection of calculations 
involving the use of different types of machines. As a user of all of these types of machines, with 
the exception of the National accounting machine, I can well appreciate his difficulty in making a 
selection. During the past seven or eight years I have had the advantage of using an installation 
of Hollerith machines, including all the machines mentioned except the multiplier, for performing 
statistical calculations in connection with the loading of generating stations, and am convinced that 
a much greater use could be made of this type of machine in the solution of vastly differing kinds of 
problems. While it is true that punched-card equipment thrives on work involving very large 
numbers of cards and that, generally speaking, it is not a commercial proposition to install such 
complicated and expensive machinery to handle only small quantities of cards ; nevertheless, there 
must be many installations in this country where punched-card equipment has already been 
installed for other purposes and where, if the users were alive to its possibilities in dealing with 
other types of work, a considerable amount of brain-fag and heart-breaking work could be avoided. 
For this reason, if for no other, Dr. Hartley’s paper is particularly welcome, as it certainly does 
show what can be done with standard accounting machinery. 

Quite apart from this aspect of the matter, it appears to me that there are advantages to be 
gained from the use of punched-card equipment, in that most of the operations are to a great 
extent automatic, if not entirely so, and can, under supervision, be carried out by junior staff. If, 
therefore, sums of squares and products, for example, are to be worked out, it may be advantageous
in the long run to use punched cards, even though the number of multiplications to be performed 
is relatively small. 

As an example of the way in which much of the work becomes automatic when performed by 
the Hollerith equipment, consider the problem solved in Part II (i) of the paper by means of an 
adding and listing machine— namely, the calculation of moving totals and averages. Using 
Dr. Hartley’s illustration, the figures of Table V would be punched into one field of a card, together 
with information indicative of the number of the observations in the series. By means of the 
reproducing punch this same series would then be reproduced into a second field on the cards, 
starting seven cards later, so that the seventh card contains, for example, the figures 0.312, originally
punched, and 0.361 now introduced from the first card, and so on. A single run of these cards
through the tabulator would then automatically give the 19 moving totals or moving averages, or 
both, as may be desired, and of course these results could simultaneously be punched into a new 
set of cards. The process by which the machine produces the results is identical with that carried 
out on an adding and listing machine described in the paper. Plugging diagrams have actually 
been worked out to punch by direct tabulation 21, 35, 53 and 365 cycle sliding averages correct to 4 
significant figures, and there would seem to be little difficulty in arranging the plugging of the 
machine to give the averages over any other number of items. If sliding totals only are required, 
then the tabulating time is much shorter, but, apart from this, much laborious work is avoided if
the series is at all long, and, in any case, once the tabulator has been started no further attention 
is required, except to add further cards to the tabulator hopper and remove used and punched cards 
from time to time. Staff can therefore be employed on other work almost throughout the time
taken to work out the figures on an adding and listing machine and calculator. The real bugbear,
as with all Hollerith work, lies in the initial punching of the cards, and whether a Hollerith method
is used or not depends for the most part on whether the trouble and time taken in punching the 
cards originally are worth while when weighed against the other advantages resulting from the use 
of the system. Often much depends on whether the cards have already been punched for another 
purpose or could be used for other purposes if they were punched, and a decision on which method 
to use is dependent on local circumstances. 
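
The card layout described above, in which each card carries its own observation together with an earlier observation reproduced into a second field, can be imitated in a few lines. This is a sketch under stated assumptions: the function name is hypothetical, a span of seven is taken from the example in the text (so that the seventh card carries the first observation), and it is assumed that the total is printed before the earlier observation is subtracted.

```python
def moving_totals(series, span=7):
    """Moving totals from cards holding (observation, earlier observation)."""
    # Second field: the observation punched (span - 1) cards earlier, so that
    # card number `span` carries the first observation; None where no such card exists.
    lag = span - 1
    cards = [(x, series[i - lag] if i >= lag else None) for i, x in enumerate(series)]
    counter = 0
    totals = []
    for observation, earlier in cards:
        counter += observation          # add the new item to the counter
        if earlier is not None:
            totals.append(counter)      # print or punch the total of the last `span` items
            counter -= earlier          # drop the item leaving the window
    return totals

series = list(range(1, 26))             # 25 hypothetical observations
totals = moving_totals(series)
```

A series of 25 observations yields 19 moving totals of seven, agreeing with the count quoted in the text; moving averages follow by dividing each total by the span.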

There is one function of the Hollerith machines which Dr. Hartley has not mentioned in his 
paper, and which, to me at any rate, is one of its most important functions. This is the capacity 
of the tabulator to select one of a number of possible alternative answers to a calculation according 
to certain predetermined conditions communicated to the machines via the plug-boards. To give 
an example. The corresponding values of four variables, $U_1$, $U_2$, $x$ and $y$, are given over a period
of time, the number of sets of values normally being considered at one time being of the order of
18,000. It is desired to calculate for each set the value of $S$ determined by the following conditions,
and to find the sum of the values of $S$ over the whole range of values. The conditions are:

$$\text{if } y - x < 0: \quad S = \begin{cases} U_2 - y & \text{if } U_2 - y > 0, \\ 0 & \text{otherwise;} \end{cases}$$

$$\text{if } y - x > 0: \quad S = \begin{cases} U_1 + U_2 - x & \text{if } U_1 + U_2 - x > 0, \\ 0 & \text{otherwise.} \end{cases}$$

Thus the machine has to calculate $(y - x)$, $(U_2 - y)$ and $(U_1 + U_2 - x)$, examine the

signs of each term, select the required answer, print or punch it, and transfer it to a counter where 
it can be stored up to give the required total. This is done merely by punching one card for each 
set of variables with the four values of these variables and passing the cards into the tabulator. 
The tabulator runs and produces the required results without any attention save the feeding and 
removal of cards. As a matter of interest, S is the minimum output of an inefficient generating 
station situated at the end of a circuit of capacity $y$ which connects it to a more efficient generating
station having available generating plant of capacity $x$, the requirements which have to be supplied
at each end of the line being respectively $U_2$ and $U_1$.
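
The selection carried out by the tabulator can be written out as a small routine. This is a sketch only: the function names and the sample figures are hypothetical, the conditions are read as y − x < 0 giving S = U2 − y and otherwise S = U1 + U2 − x (each replaced by 0 when not positive), and the case y = x, on which the text is silent, is assumed here to fall under the second condition.

```python
def minimum_output(u1, u2, x, y):
    """Select S for one set of values (u1, u2, x, y) according to the signs."""
    if y - x < 0:
        s = u2 - y              # the circuit capacity is the limiting quantity
    else:
        s = u1 + u2 - x         # the efficient station's plant is the limiting quantity
    return s if s > 0 else 0    # a negative result means no output is needed

def total_output(sets_of_values):
    """Sum of S over all sets, as accumulated in the tabulator counter."""
    return sum(minimum_output(u1, u2, x, y) for u1, u2, x, y in sets_of_values)

# Hypothetical sets of values (U1, U2, x, y).
sample = [
    (10, 30, 50, 20),   # y < x, U2 - y = 10
    (10, 30, 50, 40),   # y < x, U2 - y < 0, so S = 0
    (40, 30, 50, 60),   # y > x, U1 + U2 - x = 20
    (10, 30, 50, 60),   # y > x, U1 + U2 - x < 0, so S = 0
]
selected = [minimum_output(*row) for row in sample]
grand_total = total_output(sample)
```

Each card thus needs only the four values; the examination of signs and the choice between alternatives take the place of any intermediate clerical step.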

There is just one other point which I think should be emphasized more fully, and that is the
necessity for providing as many checks as possible on machine work. It is true that attention has
been drawn in the paper to the checks which may be applied, but in my view, and I have no doubt
that Dr. Hartley will agree, these checks are not optional, but obligatory. The fact that an
answer to a problem is machine-made is in itself no guarantee of the accuracy of that answer, and 
even if it involves additional work, as in the example of the 5-check, it is imperative that such a 
check be made. 

In conclusion, I should like to record my pleasure at being allowed a preview of Dr. Hartley’s 
paper and at having had an opportunity of joining in the discussion, if only at second hand. The 
literature on this subject published in this country is painfully sparse, and on that account, as well 
as on account of the excellence of the paper itself, my only regret is that the paper was not longer. 
I do sincerely hope, however, that it will be quickly followed by other papers on this or related
subjects. 


Dr. A. D. Booth: It was with considerable pleasure that I received an advance copy of Dr.
Hartley’s paper, and my chief feeling, after reading it, was one of regret at my inability to be present 
at the discussion. 

I am sure that everyone present is full of admiration for the virtuosity with which commercially 
available machines have been adapted to uses completely foreign to those for which they were 
originally designed. I would, however, like to express the views of a confirmed “ gadgeteer.” As 
Dr. Hartley knows, I tend to the school of thought which considers that, when a sufficiently large 
number of any one type of calculation is to be made, a specially designed machine is always justified. 

After considerable experience with calculations of the type 

$$\rho(x, y, z) = \sum_h \sum_k \sum_l F(h, k, l) \cos 2\pi \left( \frac{hx}{a} + \frac{ky}{b} + \frac{lz}{c} \right)$$

(where the summation extends over all values of $(h, k, l)$ in integer steps, positive and negative, and
$x/a$, $y/b$, $z/c$ are numbers given to three or four decimal places), I have come to the conclusion that the
point at which standard equipment breaks down is in the taking of the cosine. Although by a
piece of sheer virtuosity Dr. Hartley and his colleagues have used Hollerith equipment for this 
purpose, I still have the feeling that it is rather a case of “steamroller and nut.” I have recently
designed, and have now under construction, a machine for this purpose. Its chief virtue lies in its
using only standard telephone equipment; and at a conservative estimate it should effect a saving 
of time by a factor of 5. In addition to the gain in speed, the new machine will be comparatively
inexpensive (ca. £50, excluding labour), and when this total cost is compared with the cost of hire
of Hollerith equipment the advantage is apparent. The relay machine has an exact electronic 
analogue and, although this would be rather more expensive, it would be possible to form the 
terms of the above summation at the rate of about 1000 per second. 
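
A summation of the type discussed, with the cosine taken from a table rather than computed, might be sketched as below. Everything here is an assumption made for illustration: the form of the sum is taken to be F(h, k, l) cos 2π(hx/a + ky/b + lz/c) summed over the stored indices, the coordinates x/a, y/b, z/c are held as whole thousandths (three decimal places), and the names `COS_TABLE` and `fourier_sum` are hypothetical. It is not a description of the relay machine itself.

```python
import math

STEPS = 1000   # coordinates held as whole thousandths of a cell edge (an assumption)

# The "mathematical table" to be entered: cosines at every thousandth of a revolution.
COS_TABLE = [math.cos(2.0 * math.pi * i / STEPS) for i in range(STEPS)]

def fourier_sum(F, u, v, w):
    """Sum of F(h, k, l) * cos 2*pi*(h*u + k*v + l*w)/STEPS by table look-up.

    F maps index triples (h, k, l) to coefficients; u, v, w are the
    coordinates x/a, y/b, z/c expressed in thousandths.
    """
    total = 0.0
    for (h, k, l), coefficient in F.items():
        phase = (h * u + k * v + l * w) % STEPS    # reduce to one revolution
        total += coefficient * COS_TABLE[phase]    # enter the table, no cosine computed
    return total

# Hypothetical coefficients, with positive and negative indices.
F = {(1, 0, 0): 1.0, (-1, 0, 0): 1.0, (0, 2, 0): 0.5, (0, -2, 0): 0.5}
value = fourier_sum(F, 125, 250, 0)
```

Because the indices are whole numbers and the coordinates have a fixed number of decimals, every phase falls exactly on a tabular argument, so forming each term reduces to one integer sum, one table entry and one multiplication.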

Whilst this description applies only to the type of calculation specified, it is fairly obvious that 
the machine could be applied to any problem where it is necessary to enter mathematical tables 
and, in these cases, the advantages of an ad hoc machine would be equally great. 

In conclusion, may I thank Dr. Hartley for a most interesting paper, and for “exposing”
several of the tricks which must have been as much a mystery to other non-professional computers 
as they were to myself. 

Dr. H. O. Hartley said that perhaps he might be allowed to reply briefly to some of the points 
now, leaving further details to the reply in writing. 

In the first place, he wished to thank all speakers for the constructive suggestions and criticisms
made. 

A number of questions had been asked about his method of quadrature on the “National”
machine. Mr. Sadler asked why he had not used the formula giving the integral at arguments 
half-way between the tabular arguments of the integrand function. He agreed that this formula 
was better convergent, in fact he had used it in his original attempts at mechanized quadrature. 
However, experience had shown that the necessity for changing the interval of quadrature frequently 
arises. When narrowing down the interval of integration with the “half-way-point” formula it
is often necessary to find a new starting value of the integral by interpolating between the values 
in the preceding wide interval panel. Moreover, when tabulating an integral, the interval must 
normally be taken sufficiently fine to make the final table linear, and there is no gain in using a 
wide-interval formula in the first place. Indeed, the requirement is usually for a series of quadra- 
tures, each to be taken over a short range, with the National method worth while only at a fine 
interval. It was because of these reasons that the method described in the paper had actually 
proved more useful in practice, although Mr. Sadler's formula had most of the theoretical advantages.
Sometimes, with the range of quadrature long, one could use a method of quadrature
with simultaneous sub-tabulation of the integral for which he had worked out details based on 
the half-way-point formula. 

The question was closely linked with Dr. Goodwin’s question whether the method should be 
used for the evaluation of an integral at isolated points or for a definite integral. For the latter 
the answer was in the negative, and the ordinary method of differencing with simultaneous summation
of integrand values mentioned by Dr. Goodwin was faster; in the former case the decision
depended on details. 

In answer to Mr. Fieller, further terms could be added to the formula of quadrature, but the 
necessity seldom arose. 

A number of speakers had taken exception to his remark “some statisticians loathe figures.”
He had been thinking of industrial statisticians who often preferred the graphical method and 
approach. 

The Chairman and Dr. Hudson felt that more should have been given about the analysis of 
variance and more importance attached to flexibility of mechanized analysis rather than speed. 
The trouble with mechanizing small analyses of a varying character was that the overhead time
of setting the mechanism up became such a large proportion of the useful working time, and he
would refer these speakers to his introduction. Mr. Kempthorne at Rothamsted had worked 
out details for doing 2 ^ analysis on a Hollerith tabulator, but he understood that the method was 
worth while only if quite a number of such experiments had to be analysed simultaneously. To 
the best of his knowledge there was no Hollerith tabulator specially adapted to analysis of variance 
work, the reason, no doubt, being that requirements were too scattered and varied. 

He was interested in Mr. Seal’s references to Italian work on the mean deviation, and would 
certainly follow up these references. His own contribution to the work referred to was confined 
to evaluating Mr. Godwin’s formula numerically. He agreed that this had been quite a task, 
but it had been a case of “ theirs not to reason why, theirs but to do and die.’’ 

Mr. Todd had raised the question as to general rules when Hollerith equipment should be used 
in place of desk calculators. No general rules could be given, as the decision would depend on a 
multitude of details of organization. As a very rough-and-ready rule the department concerned 
should first compare the existing method with the possibilities offered by Hollerith (using expert 
advice). Unless the Hollerith equipment would do the job in something like one-third of the 
original time it would not be worth while pursuing the matter further unless the time factor was 
all-important. Further, it should be the aim of the department to keep the Hollerith installation 
busy all the time. It would be seen therefore that the incitement would come from an increase 
in the computing programme of a department or from a reduction in staff. 

Finally he would like to thank all speakers for the sympathetic reception his paper had received, 
and in particular the Chairman, Dr. Comrie, Mr. Hey and Mr. Mandeville for the kind personal 
references. 

Dr. Hartley subsequently wrote as follows: 

I have already replied to some of the points at the meeting, and will therefore confine myself 
to dealing with the outstanding points and with the contributions in writing:— 

Mr. Todd seemed to feel that in the scheme for solving simultaneous equations the capabilities 
of the punched-card system had not been fully exploited, and that other methods should be tried. 
I would say that Matrix Multiplication has been tried on Hollerith with rather less success, and 
that a machine specially built for scientific work (the Automatic Sequence Controlled Calculator, 
of which details have just been published in a Manual) also uses the Gauss-Doolittle elimination 
process with speeds not unlike those achieved here on ordinary commercial models very much 
smaller than the “ American Monster.” The elimination process has, in fact, been chosen because 
there are in it certain features suited to the punched-card system. Iteration, if known to converge, 
is actually better suited to using punched-card equipment, but, more often than not, conditions 
guaranteeing its convergence are not satisfied. 

Dr. Comrie’s principle of investigating the possibilities of existing machines before building 
a special one I regard as one of the most convincing lessons that I learned from him. 

Mr. Hey raises a number of technical points : The coloured X, Y, space cards have given every 
satisfaction for single groups of as few as 50 cards working, sometimes, with duplicate packs (where 
N is small k has to be large for the Hollerith method to be worth while). In the case of several 
groups quoted by Mr. Hey his space cards alone would increase sorting and tabulating time by 
about 30 per cent., and I am not convinced that sorting of separate groups should add more delay. 
Also, it may not be possible to adjust all answers by standard amounts if the size of the fields vary 
and counter capacity is restricted. 

Mr. Hey’s suggested scheme of producing moving averages necessitates recopying the time 
series in a 6 x NJt pattern — also, when the data consist of four-figure numbers or even more, 
people (like myself) somewhat weak at mental arithmetic might find the mental forming of differences 
tiring. 

I am not convinced that punching 108 cards, each in one two-figure field, with a run of 72 
cards through the Reproducer saved, should be more efficient than the punching of three (72- 
column) cards. I estimate the theoretical saving as about 1 minute Reproducer time set against 
a loss of about 7 minutes punch operators’ time. Mr. Hey’s suggestion was deliberately discarded 
in favour of the scheme given in the paper. 

The scheme for getting multiples of 10, 20, 40, 80 and —160 for all counter contents does not 
require any special re-wiring, but necessitates breaking control five times with space cards. 

I agree that the scheme for checking should be made an inherent part of the computing scheme, 
but the former is more dependent on the efficiency of the computing staff, and generalizations are 
therefore more difficult. 

Mr. Boss has kindly mentioned the difficulties of carrying out Hollerith work on “borrowed” 
machines at “inaccessible” places. It was difficulties of this kind that Miss Gittus of the Scientific 
Computing Service had to face when she brought the experiment of solving 28 equations to a 
completely successful conclusion. It is due to her perseverance and patience that, since the paper 
was written, the method has been found to work in practice. 

Mr. Seal suggested that the tabulation of the Mean Deviation tables could have been shortened 
by using formulae derived by the Italian actuary Tricomi. I find, however (unless my limited 
knowledge of Italian deceives me), that Tricomi is concerned with the mean deviation about the 
known population mean (assumed to be 0) whilst Godwin’s formula gives the distribution of the 
mean deviation about the sample mean which, naturally, turns out to be more complicated. 

In reply to Dr. Goodwin, the speed of the method of quadrature on the National varies between 
100 to 400 values per hour according to the length of runs, size and number of input differences 
and efficiency of operator. This time does not include any of the preliminary operations (e.g., 
differencing). The usage of terms “fully automatic” and “semi-automatic” follows long- 
standing trade literature practice. 

I was most interested in Mr. Ineson’s contribution. I agree that Hollerith work is more 
automatic than calculations carried out on desk calculators, and that the actual operating can be 
performed by junior staff. However, the testing of plugging, elimination of faults, control and 
final checks are skilful jobs, and require staff with good qualifications and the right temperament. 

I can visualize Mr. Ineson’s method for calculating moving totals on Hollerith: as he says, 
it is a question of whether the punching of the cards is worth while. I imagine that his method 
of conversion into averages is multiplication of totals by the reciprocal (1/6 = 0.1667) by successive 
rolling, and that it will take a large number of cycles to achieve this (I am counting that six 
cycles would be required). It might be better to summary punch moving totals, and to convert 
into averages on the Reproducer, using a master-card table (see Part I (v)), particularly if the series 
moving totals cover a limited range and, say, a month’s output of moving totals can be “ averaged ” 
simultaneously. 




The feature of distribution on a Hollerith tabulator has perhaps been wrongly neglected in 
the paper. Mr. Mandeville has, somehow, made up for this by giving his example of weighted 
averaging. Mr. Ineson has given another equally ingenious example involving the automatic 
sensing of the sign of a counter content. To make the plug-charts for either of these examples 
would, I am sure, have made Mr. Hey’s heart jump with joy! But I think we are all agreed that 
when distribution is extensively used in a scheme there is, by its very definition, a large consumption 
of counter capacity. Such schemes are therefore excellent where counter capacity is available, 
but where not, distribution may seriously reduce output. 

I was interested to hear about Dr. Booth’s planned machine for doing Fourier synthesis. It 
is, however, not for me to comment on his comparison with the Hollerith method ; firstly, because 
I was not associated with this development (which was carried out by Mr. Hey, and has been in 
successful use for three-dimensional synthesis, a calculation previously shirked), secondly, because 
Dr. Booth does not state under which conditions his quoted figures of cost and speed are applicable, 
and thirdly, because it is somewhat premature to compare the speed of a working machine with 
that of a planned construction. I would, however, remind Dr. Booth that in the case of serial 
correlation an assembly of Post Office relays resulted in a machine slightly slower than the method 
on a somewhat larger standard Hollerith installation. Here, as with Fourier synthesis, all depends 
on the scale of the work. 



The “Effective” Number of Independent Observations in an Autocorrelated Time Series 

By G. V. Bayley and J. M. Hammersley 

Introduction 

In the Symposium on Auto-Correlation in Time Series reported in the preceding part of this 
Journal, we made some reference (pp. 91-93) to our own work on the subject. This work was 
embodied in reports not generally available. The object of the present paper is to give in 
condensed form some of the results obtained. The paper should be read in conjunction with our 
own remarks and those of the other contributors to the Symposium. 


Theoretical treatment 

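1. For a set of $n$ independent readings of variance $\sigma^2$ we may write * 

$$\operatorname{var}(\bar{x}) = \sigma^2/n \tag{1}$$

$$\operatorname{var}(m_2') = 2\sigma^4/n \tag{2}$$

$$\operatorname{var}(s^2) = 2\sigma^4/(n-1) \tag{3}$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $m_2'$ is the mean square about the parametric mean, and $s^2$ is the variance of the observations. 
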

Formulae (1), (2), (3) will not hold if there is autocorrelation between the successive readings 
$x_1, x_2, \ldots, x_n$. For such a case, however, we may define numbers $n_b^*$, $n_d^*$, and $n_v^*$ (which may be 
considered in a certain sense to be the “effective number of independent observations”) so that 

$$\operatorname{var}(\bar{x}) = \sigma^2/n_b^* \tag{4}$$

$$\operatorname{var}(m_2') = 2\sigma^4/n_d^* \tag{5}$$

$$\operatorname{var}(s^2) = 2\sigma^4/(n_v^* - 1) \tag{6}$$

In definitions (4), (5), (6), $\bar{x}$, $m_2'$, and $s^2$ are defined to be unbiased estimates, scil. 

$$E(\bar{x}) = E(x), \qquad E(m_2') = E(s^2) = E\{x - E(x)\}^2 = \sigma^2 \tag{7}$$


It follows that 

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{8}$$

$$m_2' = \frac{1}{n}\sum_{i=1}^{n}\{x_i - E(x)\}^2 \tag{9}$$

(10) 

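2. If $\tau$ is the time interval between successive observations, so that $\sigma^2\rho(j\tau) = E(x_a x_{a+j})$, then 

$$\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n} + \frac{2\sigma^2}{n^2}\sum_{j=1}^{n-1}(n-j)\,\rho(j\tau) \tag{11}$$

$$\operatorname{var}(m_2') = \frac{2\sigma^4}{n} + \frac{4\sigma^4}{n^2}\sum_{j=1}^{n-1}(n-j)\,\rho^2(j\tau) \tag{12}$$

and, as equation (13), 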


var - 2e*{/i«( n - 1) - 4/.S . + 2n2:. - SS. - - 8/iS,} 

l)*_4«(rt- l)2i +2Sa- ‘22, +'824+825 • ’ ’ 

* In equations (2) and (3) it is assumed that $\mu_4 = 3\sigma^4$, i.e., the distribution of the readings is mesokurtic. 




where 


Lj = \/i -y)pO*T), Sa - (n -J)W + 2/1 - 4y)p*0‘T). 

; - 1 j - 1 

S, = (n -y)(/i* - 2yV0T), - y)A:p(yT)p(^T), 

» - 1 y - 1 |;|(„ _ i)j 

^ S («—;•)(« -A:)p(/T)p(ArT), S, = S (n ~ 2i)pHJr), 

— 2/5 — 1 ;«1 

s, = “s' S («_y-A:)pO-ir)p(^T), 

j «• 2 A « 1 


where the symbol $[\tfrac{1}{2}(n-1)]$ denotes the largest integer not greater than $\tfrac{1}{2}(n-1)$, and $K$ 
is equal to $n - j - 1$ or $j - 1$, whichever is the less. 

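3. Alternatively we may express $n^*$ in terms of $n$ and $\rho$ as follows:— 

$$\frac{1}{n_b^*} = \frac{1}{n} + \frac{2}{n^2}\sum_{j=1}^{n-1}(n-j)\,\rho(j\tau) \tag{14}$$

$$\frac{1}{n_d^*} = \frac{1}{n} + \frac{2}{n^2}\sum_{j=1}^{n-1}(n-j)\,\rho^2(j\tau) \tag{15}$$

and, as equation (16), 
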

JL _ /?H/» - 1 ) - 4- - 8S , - 4/?S e - 8/iS , 

n„* nHn — 1) — 4/i*2i 4- 2 S 2 4- SSg — 4/i2e — S/jS, 


4. The above formulae (14), (15), (16) reduce to those given by Bartlett if certain approximations 
are made, e.g., by writing $n$ for $(n - j)$ in (14) we obtain 

$$n_b^* = n \Big/ \sum_{j=-\infty}^{\infty}\rho(j\tau)$$

and (15) can be reduced in a similar manner; or if in (16) we omit S*, iig, Sg, and Sy and write 
n^ for 2(/j* — y)(/i* + 2/i — 4y) in Sj* and /i® for 2(/i — y)(/j* — 2y) in Xg 


r 1 'I 

1 >: p*(y^) ^ 

l+~f,5PW 


5. Before proceeding further it is appropriate to note the conditions for which relations (8) to 
(16) inclusive are valid. (8), (9), (10), (11), and (14) are perfectly general and hold whatever the 
distribution of the $x$'s. (12), (13), (15), and (16) are obtained on the assumption that 

$$E(x_r x_s x_u x_v) = \sigma^4\big[\rho\{(r-s)\tau\}\rho\{(u-v)\tau\} + \rho\{(r-u)\tau\}\rho\{(s-v)\tau\} + \rho\{(r-v)\tau\}\rho\{(s-u)\tau\}\big]$$

which is true if the $x$'s are defined by a linear process whose random impulse function has a mesokurtic 
distribution. As a particular case they are true when the $x$'s are normally distributed. It 
is further assumed in (13) and (16) that the parametric values of the $\rho$'s are known; so that given 
$n$, $n_b^*$ is known from (14). The evaluation of $\operatorname{var}(s^2)$ when allowance has to be made for the 
sampling errors of $n_b^*$ presents a problem much harder than that discussed here. 

6. Returning now to the general argument of the problem, let us suppose that our recordings 
are evenly spaced in time along a continuous process extending for a time $T$. Then 

$$n = T/\tau \tag{17}$$

and the labour of computation will be directly proportional to $n$. We may define the efficiency of 
the computation by the percentage ratios 

$$E_b = 100\,n_b^*/n, \quad\text{or}\quad E_d = 100\,n_d^*/n, \quad\text{or}\quad E_v = 100\,n_v^*/n \tag{18}$$

as the case may be. Thus by inserting trial values of $\tau$ in (17) and calculating $E_b$, $E_d$ or $E_v$ we can 
assess the value of $\tau$ most suited to our purposes. 
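
The procedure of paragraph 6 can be sketched numerically. The following is a minimal illustration, not the authors' own computing scheme: it evaluates the sums for the mean and for the mean square about the parametric mean (equations (14) and (15)) directly and forms the corresponding efficiencies of (18). The function names, the damped-cosine autocorrelation and its constants (`mu`, `lam`) are assumptions chosen only for the example; the variance case (16) is not attempted.

```python
import math

def effective_numbers(rho, n, tau):
    """Return (n_b*, n_d*) for n readings spaced tau apart.

    rho is a callable giving the autocorrelation at a time lag s.
    The two sums are those of equations (14) and (15).
    """
    s1 = sum((n - j) * rho(j * tau) for j in range(1, n))
    s2 = sum((n - j) * rho(j * tau) ** 2 for j in range(1, n))
    n_b = 1.0 / (1.0 / n + 2.0 * s1 / n**2)
    n_d = 1.0 / (1.0 / n + 2.0 * s2 / n**2)
    return n_b, n_d

def efficiencies(rho, T, tau):
    """Return (n, E_b, E_d) for a record of duration T read every tau."""
    n = round(T / tau)                      # equation (17)
    n_b, n_d = effective_numbers(rho, n, tau)
    return n, 100.0 * n_b / n, 100.0 * n_d / n   # equation (18)

def damped_cosine(s, mu=0.1, lam=2.0 * math.pi / 8.0):
    """Hypothetical autocorrelation used only for illustration."""
    return math.exp(-mu * s) * math.cos(lam * s)

if __name__ == "__main__":
    for tau in (1.0, 2.0, 3.0, 4.0, 6.0):
        n, E_b, E_d = efficiencies(damped_cosine, T=150.0, tau=tau)
        print(f"tau={tau:4.1f}  n={n:4d}  E_b={E_b:7.1f}%  E_d={E_d:6.1f}%")
```

Running the loop for a few trial intervals gives the kind of comparison that Table I summarizes, although the figures printed will depend entirely on the assumed autocorrelation.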




Table I 

Time         No. of actual    Effective number of observations    Efficiency, % 
interval τ   observations n   n_b*     n_d*     n_v*              E_b    E_d    E_v 

    0             ∞           150       66       65                 0      0      0 
    1            150          145       63       62                96     42     41 
    2             75          128       56       55               171     75     74 
    3             50          103       43       43               205     87     86 
    4             38           78       31       31               204     82     82 
    5             30           53       25       25               177     82     82 
    6             25           35       23       23               141     93     93 
    7             22           24       22       22               108     99     99 
    8             19           17       19       19                89     99     98 
    9             17           14       17       17                82     97     97 
   10             15           13       15       15                86     98     98 


Practical application 

7. Equations (14), (15), (16) are difficult to apply as they stand in practice, because of the labour 
required to evaluate them successively for a number of different trial values of $\tau$. We modify them 
accordingly by assuming the functional form of the autocorrelation is 

$$\rho(s) = e^{-\mu s}\cos\lambda s \tag{19}$$

where $\lambda$ and $\mu$ are constants peculiar to the autoregressive process under consideration. On substituting 
(19) in (14), (15), and (16), and neglecting $O(\exp(-\mu n\tau))$ after the various summations have 
been effected, we obtain 

$$\frac{1}{n_b^*} = \frac{1}{n}\left\{\frac{\sinh\mu\tau}{\cosh\mu\tau - \cos\lambda\tau}\right\} + \frac{1}{n^2}\left\{\frac{1 - \cosh\mu\tau\cos\lambda\tau}{(\cosh\mu\tau - \cos\lambda\tau)^2}\right\} \tag{20}$$

$$\frac{1}{n_d^*} = \frac{1}{2n}\left\{\frac{\sinh 2\mu\tau}{\cosh 2\mu\tau - \cos 2\lambda\tau} + \frac{\sinh 2\mu\tau}{\cosh 2\mu\tau - 1}\right\} + \frac{1}{2n^2}\left\{-\frac{1}{\cosh 2\mu\tau - 1} - \frac{\cosh 2\mu\tau\cos 2\lambda\tau - 1}{(\cosh 2\mu\tau - \cos 2\lambda\tau)^2}\right\} \tag{21}$$

J_ 1, / sinh 2(xt , 1 , X f sinh 2{xt — 1 __ sinh (xt — cos Xt 

2/1 \cosh 2|xt — cos 2Xt ' J ^ /i* [cosh 2(xt 1 cosh (xt — cos Xt 

— i( cos 2Xt - f l)(cos 2Xt —• 9) -f ^(cosh 2(xt 4- 1) (3 co s 2Xt f 1) — 2 cosh (xt sin Xt sin 2Xt'1 

(cosh 2(xt — cos 2Xt)* j 

4-0(«'*) . . (22) 


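For purposes of computation we write these equations as 

$$n_b^* = n f_b(F, D)\Big/\Big\{1 + \frac{1}{n} f_b'(F, D)\Big\} \tag{23}$$

$$n_d^* = n f_d(F, D)\Big/\Big\{1 - \frac{1}{n} f_d'(F, D)\Big\} \tag{24}$$

$$n_v^* = n f_d(F, D)\Big/\Big\{1 - \frac{1}{n} f_v'(F, D)\Big\} \tag{25}$$

where 

$$F = \lambda\tau/2\pi, \qquad D = \mu\tau \tag{26}$$

and $f_b$, $f_b'$, $f_d$, $f_d'$, $f_v'$ are functions which we have tabulated in Tables II to VI. 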
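
As a cross-check on the tabulated functions, the sketch below evaluates closed forms for $f_b$, $f_b'$, $f_d$ and $f_d'$. These expressions are not quoted from the paper: they are what one obtains by substituting the damped cosine (19) into the sums (14) and (15) and dropping terms of order $\exp(-\mu n\tau)$, so they should be treated as an assumption. The function $f_v'$ is not attempted, and all function names are illustrative.

```python
import math

def f_b(F, D):
    """Leading factor for the mean: n_b* ~ n * f_b when n is large."""
    return (math.cosh(D) - math.cos(2.0 * math.pi * F)) / math.sinh(D)

def f_b_prime(F, D):
    """Finite-length correction used in the denominator of (23)."""
    c = math.cos(2.0 * math.pi * F)
    return (1.0 - math.cosh(D) * c) / (math.sinh(D) * (math.cosh(D) - c))

def f_d(F, D):
    """Leading factor for the mean square about the parametric mean."""
    c2 = math.cos(4.0 * math.pi * F)
    inv = 0.5 * (1.0 / math.tanh(D)
                 + math.sinh(2.0 * D) / (math.cosh(2.0 * D) - c2))
    return 1.0 / inv

def f_d_prime(F, D):
    """Finite-length correction used in the denominator of (24)."""
    c2 = math.cos(4.0 * math.pi * F)
    ch2 = math.cosh(2.0 * D)
    g = 0.5 * (1.0 / (ch2 - 1.0) + (ch2 * c2 - 1.0) / (ch2 - c2) ** 2)
    return f_d(F, D) * g

def n_b_star(n, F, D):
    """Equation (23)."""
    return n * f_b(F, D) / (1.0 + f_b_prime(F, D) / n)

def n_d_star(n, F, D):
    """Equation (24)."""
    return n * f_d(F, D) / (1.0 - f_d_prime(F, D) / n)

if __name__ == "__main__":
    # Compare with a few tabulated entries (percentages in Tables II and IV).
    print(round(100.0 * f_b(0.0, 1.0)), round(100.0 * f_b(0.5, 1.0)))
    print(round(100.0 * f_d(0.25, 1.0), 1))
```

For a series long enough that $\exp(-\mu n\tau)$ is negligible, `n_b_star` and `n_d_star` agree with direct summation of (14) and (15) to rounding error, which is one way of checking a transcription of the tables.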

8. In practice we first estimate the quantities $p$ and $h$ from the sample correlogram, where $p$ is 
the average period (in seconds) of the correlogram, and $h$ is the average decrement per period of 
the autocorrelation coefficient, which can be found by taking the average ratio $r(s + p)/r(s)$ for 
a number of values of $s$. Effect can be given to the greater reliability of $r(s)$ when $s$ is small by 
weighting the ratios empirically according to the value of $s$. It is clear that 

$$F = \tau/p \tag{27}$$

$$D = -F\log h \tag{28}$$










Table II 

Effective number of independent observations (%) for bias (n large) 
100 f_b(F, D) 

F = 0, or any integer, + 

   D        .00     .05     .10     .20     .30     .40     .50 
           1.00     .95     .90     .80     .70     .60 

(L) D→0    50×D   4.89/D  19.1/D  69.1/D 130.9/D 180.9/D 200.0/D 
  0.1        5      54     196     695    1311    1810    2001 
  0.2       10      34     104     353     660     909    1004 
  0.3       15      31      78     242     445     609     672 
  0.4       20      32      66     188     338     460     507 
  0.5       24      34      61     157     276     372     408 
  0.6       29      37      59     138     235     313     343 
  0.7       34      40      59     125     206     272     297 
  0.8       38      44      59     116     185     242     263 
  0.9       42      47      61     110     170     218     237 
  1.0       46      50      62     105     158     200     216 
  1.1       50      54      64     102     148     186     200 
  1.2       54      57      66     100     140     174     186 
  1.3       57      60      68      98     134     164     175 
  1.4       60      63      70      97     129     155     165 
  1.5       64      66      72      96     125     148     157 
  1.6       66      69      74      95     122     143     151 
  1.7       69      71      76      95     119     137     145 
  1.8       72      73      78      95     116     133     140 
  1.9       74      76      80      95     114     129     135 
  2.0       76      78      81      95     112     126     131 
  3.0       91      91      92      97     104     109     110 
  4.0       96      97      97      99     101     103     104 
  5.0       99      99      99     100     100     101     101 
  6.0       99     100     100     100     100     100     100 


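9. The limiting forms of (23), (24), (25) as $\tau \to 0$ are of interest in that they give the maximum 
amount of information that can be extracted from a process of duration $T$. We obtain in this 
limiting case:— 

$$n_b^* \to \frac{T}{p}A(h)\Big/\Big\{1 + \frac{p}{T}A'(h)\Big\} \tag{29}$$

$$n_d^* \to \frac{T}{p}B(h)\Big/\Big\{1 - \frac{p}{T}B_d'(h)\Big\} \tag{30}$$

$$n_v^* \to \frac{T}{p}B(h)\Big/\Big\{1 - \frac{p}{T}B_v'(h)\Big\} \tag{31}$$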

where 


Aih) - (471* -t- A'{h) = (477* - H*)im4T^ + //*) ) 

B(h) - /f(4K* + /f*)/(27c* + Af*). BAh) = (877* + 277*/A* + + AA*)(477* + //*) [ (32) 

B.'ih) - (1677* - 2277*//* - 4- AA*)(477» + A/»), H= - log h. J 

$A$, $A'$, $B$, $B_d'$, $B_v'$ and $H$ are tabulated in Table 7. 

10. The case of the Markoff process can be deduced from the foregoing results by taking 
$\lambda = 0$ (i.e., $F = 0$ also). In the limiting case we have in place of (29), (30), (31) 

$$n_b^* = \tfrac{1}{2}\mu T\Big/\Big(1 - \frac{1}{\mu T}\Big) \tag{33}$$

(34) 




( 35 ) 






Table III 
f_b'(F, D) 

F = 0, or any integer, + 

   D       .00     .01     .02     .03     .04     .05     .10     .20     .30     .40     .50 
          1.00     .99     .98     .97     .96     .95     .90     .80     .70     .60 

  0.01   -100.0   +95.1   +97.5   +99.2   +99.7   +99.8  +100.0  +100.0  +100.0  +100.0  +100.0 
  0.02    -50.0   +40.8   +47.5   +48.6   +49.4   +49.6   +50.0   +50.0   +50.0   +50.0   +50.0 
  0.03    -33.3   +21.0   +29.8   +31.6   +32.5   +32.8   +33.2   +33.3   +33.3   +33.3   +33.3 
  0.04    -25.0   +10.6   +20.3   +22.9   +23.8   +24.2   +24.8   +25.0   +25.0   +25.0   +25.0 
  0.05    -20.0    +4.5   +14.5   +17.4   +18.5   +19.0   +19.8   +20.0   +20.0   +20.0   +20.0 
  0.06    -16.7    +0.8   +10.5   +13.6   +14.9   +15.6   +16.4   +16.6   +16.6   +16.7   +16.7 
  0.07    -14.3    -1.5    +7.6   +10.9   +12.2   +13.0   +14.0   +14.2   +14.3   +14.3   +14.3 
  0.08    -12.5    -3.0    +5.3    +8.7   +10.2   +11.0   +12.1   +12.4   +12.5   +12.5   +12.5 
  0.09    -11.1    -3.8    +3.6    +7.0    +8.6    +9.4   +10.7   +11.0   +11.1   +11.1   +11.1 
  0.10    -10.0    -4.3    +2.2    +5.6    +7.3    +8.2    +9.5    +9.9   +10.0   +10.0   +10.0 
  0.11     -9.1    -4.6    +1.2    +4.5    +6.2    +7.1    +8.6    +9.0    +9.0    +9.1    +9.1 
  0.12     -8.3    -4.7    +0.4    +3.5    +5.3    +6.2    +7.8    +8.2    +8.3    +8.3    +8.3 
  0.13     -7.7    -4.7    -0.2    +2.8    +4.5    +5.5    +7.1    +7.5    +7.6    +7.7    +7.7 
  0.14     -7.1    -4.7    -0.7    +2.1    +3.8    +4.8    +6.5    +7.0    +7.1    +7.1    +7.1 
  0.15     -6.6    -4.6    -1.1    +1.5    +3.2    +4.2    +6.0    +6.5    +6.6    +6.6    +6.6 
  0.16     -6.2    -4.5    -1.4    +1.0    +2.7    +3.7    +5.5    +6.1    +6.2    +6.2    +6.2 
  0.17     -5.9    -4.4    -1.7    +0.6    +2.2    +3.2    +5.1    +5.7    +5.8    +5.8    +5.9 
  0.18     -5.5    -4.3    -1.9    +0.3    +1.8    +2.8    +4.7    +5.4    +5.5    +5.5    +5.5 
  0.19     -5.2    -4.2    -2.0    -0.0    +1.5    +2.5    +4.4    +5.1    +5.2    +5.2    +5.2 
  0.20     -5.0    -4.1    -2.1    -0.3    +1.1    +2.1    +4.1    +4.8    +4.9    +5.0    +5.0 
  0.30     -3.3    -3.0    -2.3    -1.4    -0.5    +0.2    +2.1    +3.0    +3.2    +3.3    +3.3 
  0.40     -2.4    -2.3    -2.0    -1.5    -1.0    -0.5    +1.1    +2.1    +2.3    +2.4    +2.4 
  0.50     -1.9    -1.9    -1.7    -1.4    -1.1    -0.8    +0.5    +1.5    +1.8    +1.9    +1.9 
  0.60     -1.6    -1.5    -1.4    -1.3    -1.1    -0.9    +0.2    +1.1    +1.4    +1.5    +1.6 
  0.70     -1.3    -1.3    -1.2    -1.1    -1.0    -0.8    -0.0    +0.9    +1.2    +1.3    +1.3 
  0.80     -1.1    -1.1    -1.1    -1.0    -0.9    -0.8    -0.2    +0.6    +1.0    +1.1    +1.1 
  0.90     -1.0    -0.9    -0.9    -0.9    -0.8    -0.7    -0.2    +0.5    +0.8    +0.9    +1.0 
  1.00     -0.9    -0.8    -0.8    -0.8    -0.7    -0.7    -0.3    +0.4    +0.7    +0.8    +0.9 
  2.00     -0.3    -0.3    -0.3    -0.3    -0.3    -0.3    -0.2    -0.0    +0.1    +0.2    +0.3 
  3.00     -0.1    -0.1    -0.1    -0.1    -0.1    -0.1    -0.1    -0.0    +0.0    +0.1    +0.1 
  4.00     -0.0    -0.0    -0.0    -0.0    -0.0    -0.0    -0.0    -0.0    +0.0    +0.0    +0.0 

$$n_b^* = \frac{n f_b(F, D)}{1 + \frac{1}{n} f_b'(F, D)}$$


11. We can extend the treatment to the case where a random component has been superimposed 
on the autoregressive process, for which the autocorrelation function will then be 

$$\rho(s) = 1 \quad (s = 0); \qquad \rho(s) = a\,e^{-\mu s}\cos\lambda s \quad (s \neq 0) \tag{36}$$

by first finding the values $n_b^*(1)$, $n_d^*(1)$, $n_v^*(1)$ which correspond to $a = 1$, using $\lambda$ and $\mu$ appropriate 
to equation (36), and then applying the equations:— 


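$$\frac{1}{n_b^*} = \frac{1}{n} + a\left\{\frac{1}{n_b^*(1)} - \frac{1}{n}\right\} \tag{37}$$

$$\frac{1}{n_d^*} = \frac{1}{n} + a^2\left\{\frac{1}{n_d^*(1)} - \frac{1}{n}\right\} \tag{38}$$
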

1 

1 

n 

+ a*\ 

r 1 

l «.*(!) " 

■;} 


am-a)(,r 1 

rlfiF, D) 


- '] + " + 



(37) 

(38) 

(39) 


Typical cases of n* and E 

12. We may consider firstly the case where the time series is very long and where $n^*$ may be 
taken as equal to $n f(F, D)$. For a series $T$ seconds long let $P = T/p$. Also for practical purposes 
$\tau = T/n$, i.e., $F = \tau/p = P/n$, $D = FH = HP/n$. 



Table IV 

Effective number of independent observations (%) for dispersion (n large) 
100 f_d(F, D) 

F = 0, or any integer, + 

   D      .00    .01    .02    .03    .04    .05    .10    .15    .20    .25 
          .50    .49    .48    .47    .46    .45    .40    .35    .30    .75 
         1.00    .51    .52    .53    .54    .55    .60    .65    .70 
                 .99    .98    .97    .96    .95    .90    .85    .80 

  0.01    1.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0    2.0 
  0.02    2.0    3.7    3.9    4.0    4.0    4.0    4.0    4.0    4.0    4.0 
  0.03    3.0    5.1    5.7    5.9    5.9    5.9    6.0    6.0    6.0    6.0 
  0.04    4.0    6.2    7.3    7.7    7.8    7.9    8.0    8.0    8.0    8.0 
  0.05    5.0    7.2    8.8    9.4    9.6    9.7    9.9   10.0   10.0   10.0 
  0.06    6.0    8.1   10.1   11.0   11.4   11.6   11.9   11.9   11.9   11.9 
  0.07    7.0    9.0   11.3   12.4   13.0   13.3   13.8   13.9   13.9   13.9 
  0.08    8.0    9.9   12.4   13.8   14.6   15.0   15.7   15.8   15.8   15.9 
  0.09    9.0   10.7   13.4   15.1   16.1   16.6   17.6   17.7   17.8   17.8 
  0.10   10.0   11.6   14.3   16.3   17.5   18.2   19.4   19.6   19.7   19.7 
  0.11   11.0   12.5   15.3   17.4   18.8   19.7   21.2   21.5   21.6   21.7 
  0.12   11.9   13.4   16.1   18.5   20.1   21.1   23.0   23.4   23.5   23.6 
  0.13   12.9   14.3   17.0   19.5   21.3   22.5   24.7   25.2   25.4   25.4 
  0.14   13.9   15.2   17.9   20.5   22.4   23.8   26.4   27.0   27.2   27.3 
  0.15   14.9   16.1   18.7   21.4   23.5   25.0   28.0   28.8   29.1   29.1 
  0.16   15.9   17.0   19.6   22.3   24.5   26.2   29.7   30.6   30.9   31.0 
  0.17   16.8   17.9   20.4   23.2   25.5   27.3   31.2   32.3   32.7   32.7 
  0.18   17.8   18.8   21.3   24.0   26.5   28.4   32.8   34.0   34.4   34.5 
  0.19   18.8   19.7   22.1   24.9   27.4   29.4   34.3   35.7   36.1   36.3 
  0.20   19.7   20.7   22.9   25.7   28.3   30.4   35.7   37.3   37.9   38.0 
  0.30   29.1   29.7   31.4   33.8   36.4   39.0   48.1   51.8   53.3   53.7 
  0.40   38.0   38.4   39.7   41.6   43.9   46.4   57.2   63.1   65.7   66.4 
  0.50   46.2   46.5   47.5   49.0   50.9   53.1   64.2   71.5   75.1   76.2 
  0.60   53.7   54.0   54.7   55.9   57.5   59.4   69.8   77.7   82.0   83.4 
  0.70   60.4   60.6   61.2   62.2   63.5   65.1   74.4   82.3   87.0   88.5 
  0.80   66.4   66.6   67.1   67.8   68.9   70.2   78.3   85.9   90.6   92.2 
  0.90   71.6   71.8   72.2   72.8   73.7   74.7   81.7   88.6   93.1   94.7 
  1.00   76.2   76.3   76.6   77.1   77.8   78.7   84.6   90.7   94.9   96.4 
  2.00   96.4   96.4   96.5   96.5   96.6   96.7   97.7   98.7   99.6   99.9 
  3.00   99.5   99.5   99.5   99.5   99.5   99.6   99.7   99.8   99.9  100.0 
  4.00  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0 


We may therefore write:— 

$$n_b^* = n f_b(F, D) = n\,\phi_b(n/P, h); \quad\text{i.e.,}\quad n_b^*/P = \Phi_b(n/P, h).$$

$$n_d^* = n_v^* = n f_d(F, D) = n\,\phi_d(n/P, h); \quad\text{i.e.,}\quad n_d^*/P = n_v^*/P = \Phi_d(n/P, h).$$

Similarly $E_b$, $E_d$, $E_v$ are functions of $F$ and $h$ whatever the value of $P$. We have calculated and 
graphed the expressions $n_b^*/P$, $n_v^*/P = n_d^*/P$, $E_b$ and $E_v = E_d$ for the three values of $h$ = 0.2, 
0.5 and 0.8. These values roughly cover the range of $h$ often found in practice. $n^*/P$ is shown 
as a function of $n/P$ in Figures 1 and 2. $E$ is shown as a function of $F$ in Figures 3 and 4. 

The gradients of the lines OA, OB, OC in Figure 1 are proportional to $E_b$ at the selected points. 
C is the point of maximum efficiency in each case. 

13. A number of deductions may be made from Figures 1 to 4, although it must be remembered 
that they are only valid for autocorrelations of the form $\rho(s) = e^{-\mu s}\cos\lambda s$. The efficiency of 
estimation of the mean has maximum values when $\tau \approx 0.375p, 1.45p, 2.5p, 3.5p, 4.5p, \ldots$ and 
minimum values occur when $\tau = p, 2p, 3p, \ldots$. The efficiency of estimation of dispersion has 
maximum values when $\tau \approx 0.375p, 0.80p, 1.25p, 1.75p, 2.25p, \ldots$ and minimum values occur 
when $\tau = 0.5p, 1.0p, 1.5p, 2.0p, \ldots$. There is a negligible variation of these optimum efficiency 
points for the different values of $h$ and we conclude that these rules hold true whatever the damping 
of the autocorrelation function. 
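
The location of the first maximum for the mean can be checked by a simple scan. The sketch below is an illustration only: it assumes the long-series form $E_b = 100 f_b(F, D)$ with $D = -F\log h$, and the closed form used for $f_b$ is the one obtained by substituting the damped cosine into the sum for the mean, which is an assumption rather than a formula quoted from the paper. The grid limits and function names are arbitrary choices.

```python
import math

def bias_efficiency(F, h):
    """Long-series efficiency (per cent) for the mean at interval tau = F * p."""
    D = -F * math.log(h)
    return 100.0 * (math.cosh(D) - math.cos(2.0 * math.pi * F)) / math.sinh(D)

def first_maximum(h, lo=0.05, hi=0.60, steps=551):
    """Value of F = tau/p on a regular grid at which the efficiency is greatest."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda F: bias_efficiency(F, h))

if __name__ == "__main__":
    for h in (0.2, 0.5, 0.8):
        F_best = first_maximum(h)
        print(f"h={h}: first maximum near tau = {F_best:.3f} p, "
              f"E_b = {bias_efficiency(F_best, h):.0f}%")
```

The scan can be extended to larger values of F to look at the later maxima and at the minima near whole periods.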






14. [...] obtained for the mean of the [...] efficiency that can be achieved by a proper [...]. 
The opposite applies to the estimation of dispersion. Here the 
lighter the damping the lower the efficiency that can be secured: we conclude that the reliability 
of the mean and standard deviation, respectively, of n observations from an autocorrelated time 
series [...] from their normal values, using n in the normal [...]. 


Fig. 2. 

Effective number of independent observations for dispersion, corresponding to n actual observations 
from a series of length P periods (= T seconds). Graphs for h = 0.5 and h = 0.8 have been 
displaced vertically through P and 2P respectively. 

15. Sometimes good estimates of the mean and dispersion are required from the same observations. 
The use of $\tau = 0.375p$ has much to commend it. $E_b$ and $E_v$ have maximum values in 
this region and $n_b^*$ and $n_v^*$ are then reasonably close to their limiting values. There seems to be 
little advantage in ever reducing the time interval below about $p/5$. 

16. Equation (12) may be written: 

$$\operatorname{var}(m_2') = \frac{2\sigma^4}{n}\Big\{1 + \frac{2}{n}\sum_{j=1}^{n-1}(n-j)\,\rho^2(j\tau)\Big\} > 2\sigma^4/n.$$

It follows that the standard error of the mean square error about the parametric mean is always 
greater than that given by the normal formula assuming the observations to be independent. We 
do not know whether the same applies to the variance of the observations. The relatively low values 
of $n_v^*$ that can sometimes arise emphasize the importance of selecting a few observations per series 
from as many separate series as possible, if an accurate assessment of dispersion is required. 

17. The formula $n^* = n f(F, D)$ may be considered as applying to an infinite series. It is 
therefore important to observe the effect of introducing the correction to $n^*$ in the practical case 
of series of limited length. It will be assumed however that the series is long enough to justify the 
assumption in para. 7 that $O(e^{-\mu n\tau})$ can be ignored. 

Fig. 3. 

“Bias Efficiency” graphs showing the effective number of observations per cent. ($E_b$), in a long series. 
Graphs for h = 0.5 and h = 0.8 have been displaced vertically through 100 and 200 respectively. 

(i) $n_b^*$ and $E_b$. The peaks of the curves in Figs. 1 and 3 are depressed and the troughs 
are elevated by the term $\{1 + \frac{1}{n} f_b'(F, D)\}^{-1}$. The general shapes of the curves are not altered, 
and therefore the conclusions to be drawn are not altered. 

(ii) $n_d^*$ and $E_d$. Values of $n_d^*$ are increased by the factor $\{1 - \frac{1}{n} f_d'(F, D)\}^{-1}$, which is 
always greater than 1. As $\tau$ is reduced in any given series this factor tends to the constant 
value $\{1 - \frac{p}{T} B_d'(h)\}^{-1}$. The general shapes of the curves in Figures 2 and 4 are not altered 
and the conclusions are unaffected. 

(iii) $n_v^*$ and $E_v$. The effect of limiting the series length depends upon the time interval 
chosen. The incorporation of the factor $\{1 - \frac{1}{n} f_v'(F, D)\}^{-1}$ reflects the reliability of the mean 
obtained from the observations. Consequently where $\tau$ has values in the region of $0.375p$, 
$1.45p$, $2.5p, \ldots$ the effect of the correction factor is to increase $n_v^*$ appreciably. Conversely 
for $\tau = p, 2p, 3p, \ldots$ $n_v^*$ is diminished. The shapes of the curves in Figures 2 and 4 alter 
correspondingly, but the general conclusions reached are unaffected, unless the series is very 
short. In this event the fundamental assumption that $O(e^{-\mu n\tau})$ can be ignored will probably 
break down also. Thus for very short series the use of the formulae can produce negative 
values of $n_d^*$, $n_v^*$ due to the neglect of $O(e^{-\mu n\tau})$. 

Fig. 4. 

“Dispersion Efficiency” graphs showing the effective number of observations per cent. ($E_v$ and 
$E_d$) in a long series. 

Fig. 5. 

Correlogram for the lateral errors of a typical anti-aircraft radar set, and exponentially damped curves 


Table V 
f_d'(F, D) 

F = 0, or any integer, + 

   D      .00    .01    .02    .03    .04    .05    .10    .15    .20    .25 
          .50    .49    .48    .47    .46    .45    .40    .35    .30    .75 
         1.00    .51    .52    .53    .54    .55    .60    .65    .70 
                 .99    .98    .97    .96    .95    .90    .85    .80 

  0.01   50.0   47.6   49.4   49.7   49.8   49.9   50.0   50.0   50.0   50.0 
  0.02   25.0   21.2   23.8   24.4   24.7   24.8   24.9   25.0   25.0   25.0 
  0.03   16.7   12.4   15.0   15.9   16.2   16.3   16.6   16.6   16.6   16.6 
  0.04   12.5    8.5   10.6   11.5   11.9   12.1   12.4   12.4   12.4   12.4 
  0.05   10.0    6.6    7.9    8.8    9.3    9.5    9.9    9.9    9.9    9.9 
  0.06    8.3    5.5    6.2    7.0    7.5    7.8    8.2    8.2    8.3    8.2 
  0.07    7.1    4.8    5.0    5.8    6.2    6.5    6.9    7.0    7.1    7.0 
  0.08    6.2    4.4    4.2    4.8    5.3    5.5    6.0    6.1    6.1    6.1 
  0.09    5.5    4.1    3.7    4.1    4.5    4.8    5.3    5.4    5.4    5.4 
  0.10    5.0    3.7    3.2    3.5    3.9    4.2    4.7    4.8    4.9    4.9 
  0.11    4.5    3.5    2.9    3.1    3.4    3.7    4.2    4.3    4.4    4.4 
  0.12    4.1    3.3    2.7    2.8    3.0    3.3    3.8    4.0    4.0    4.0 
  0.13    3.8    3.1    2.5    2.5    2.7    2.9    3.5    3.6    3.7    3.7 
  0.14    3.5    3.0    2.4    2.3    2.5    2.7    3.2    3.3    3.4    3.4 
  0.15    3.3    2.8    2.3    2.1    2.2    2.4    2.9    3.1    3.1    3.1 
  0.16    3.1    2.7    2.2    2.0    2.1    2.2    2.7    2.9    2.9    2.9 
  0.17    2.9    2.6    2.1    1.9    1.9    2.0    2.5    2.7    2.7    2.7 
  0.18    2.7    2.4    2.0    1.8    1.8    1.9    2.3    2.4    2.5    2.6 
  0.19    2.6    2.3    1.9    1.7    1.7    1.7    2.1    2.3    2.4    2.4 
  0.20    2.4    2.2    1.8    1.6    1.6    1.6    2.0    2.2    2.2    2.3 
  0.30    1.6    1.5    1.3    1.2    1.1    1.0    1.1    1.2    1.3    1.3 
  0.40    1.2    1.1    1.0    0.9    0.8    0.8    0.7    0.8    0.8    0.8 
  0.50    0.9    0.8    0.8    0.7    0.7    0.6    0.5    0.5    0.5    0.6 
  0.60    0.7    0.7    0.7    0.6    0.5    0.5    0.4    0.3    0.4    0.4 
  0.70    0.5    0.6    0.5    0.5    0.4    0.4    0.3    0.2    0.2    0.2 
  0.80    0.4    0.5    0.4    0.4    0.4    0.3    0.2    0.2    0.2    0.2 
  0.90    0.3    0.3    0.3    0.3    0.3    0.3    0.2    0.1    0.1    0.1 
  1.00    0.3    0.3    0.3    0.3    0.2    0.2    0.2    0.1    0.1    0.1 
  2.00    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0 

$$n_d^* = \frac{n f_d(F, D)}{1 - \frac{1}{n} f_d'(F, D)}$$


Time interval necessary for virtually independent observations 

18. As the time interval is increased $n^*$ approaches n and the efficiency approaches 100 per cent. The rapidity of approach depends upon the rapidity of damping of the autocorrelation coefficients. It is often convenient to have readings virtually independent, and Tables II and IV suggest that D should not be less than 4.0 and 2.0 respectively. Clearly it is desirable to have as large a margin as possible over this minimum requirement. Since $D = \mu\tau$, the condition is that $e^{-\mu\tau}$ should not be greater than 0.018 and 0.135 respectively. This principle is illustrated in Fig. 5.

Fig. 1 in the paper by Cunningham and Hynd, in the preceding part of this Journal, relates to the elevation errors of a typical anti-aircraft radar set. The correlogram given here relates to the lateral errors of the same radar. Both correlograms were obtained by taking, for each value of s, the average of ten values of r(s) obtained from ten separate series. The curves $\pm e^{-0.2s}$ intersect the lines ± 0.135 at s = 10 seconds. On the assumption that the correlogram damps within the limits $\pm e^{-0.2s}$ we obtain 10 seconds as a minimum time interval (τ') to give approximate independence of observations for an estimate of dispersion. It can also be seen that for purposes of estimating the mean, τ' would be about 20 seconds. This approach is only justified of course when there is evidence that the correlogram damps according to the hypothesis. In these cases when τ > τ' it seems reasonable to treat the observations as independent and to apply normal significance tests as necessary.
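
The rule of thumb above can be sketched numerically. The snippet below assumes an exponential damping envelope exp(-mu * s) with mu = 0.2 per second (the value implied by the 10-second crossing of the 0.135 lines); the function name `min_interval` is illustrative only and does not come from the paper.

```python
from math import exp


def min_interval(mu, d_min):
    """Smallest sampling interval tau (seconds) for which D = mu * tau >= d_min.

    mu    : assumed damping rate of the correlogram envelope exp(-mu * s)
    d_min : minimum D read from the tables (about 2.0 for dispersion, 4.0 for the mean)
    """
    return d_min / mu


if __name__ == "__main__":
    mu = 0.2  # assumed damping rate per second
    for label, d_min in (("dispersion", 2.0), ("mean", 4.0)):
        tau = min_interval(mu, d_min)
        # exp(-d_min) is the envelope height at that interval
        print(label, "tau' =", tau, "s, envelope height =", round(exp(-d_min), 3))
```

With these assumed inputs the two intervals come out at 10 and 20 seconds, and the envelope heights exp(-2) and exp(-4) are the 0.135 and 0.018 levels quoted in paragraph 18.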

Practical example 

19. We take as an example the particular radar set already discussed. Employing the correlogram in Fig. 1 of the paper by Cunningham and Hynd we estimate the constants

h = 0.1, p = 9.5 seconds.

Fig. 6.

Correlogram for the elevation errors of a typical anti-aircraft radar set, and a fitted curve.

(h = 0.1, p = 9.5 secs.)

The fitted curve on these assumptions is shown in Fig. 6. The average series length is 150 seconds and from this information we may readily prepare Table I.

From a table of this nature it is possible to weigh up the relative merits of any particular time 
interval for subsequent trials. 

Tables 

20. Although extensive tables would be necessary to permit linear interpolation, we have found Tables II-VI sufficient for most practical purposes. In some parts of the tables (e.g., Table II for low values of D) it is sometimes as well to determine the values of the functions required directly from the formulae which define them.




[Table of a function of F and D, printed sideways in the original; entries illegible in the source scan.]

Table VII

Calculation of the maximum number of independent observations in a series of T seconds duration

   h     -log_e h     A(h)     A'(h)     B(h)    B_d'(h)   B_v'(h)
  0.1     2.303        9.7     0.331    4.118     0.177     0.073
  0.2     1.609       13.1     0.545    3.032     0.277     0.327
  0.3     1.204       17.0     0.772    2.326     0.388     0.594
  0.4     0.916       22.0     1.046    1.795     0.524     0.905
  0.5     0.693       28.8     1.408    1.370     0.704     1.298
  0.6     0.511       38.9     1.932    1.015     0.966     1.850
  0.7     0.357       55.5     2.785    0.711     1.393     2.727
  0.8     0.223       88.5     4.469    0.446     2.235     4.432
  0.9     0.105      187.3     9.482    0.211     4.743     9.465

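$$n_m^*\ (\text{max.}) = \frac{T}{p}\cdot\frac{A(h)}{1 + \dfrac{p}{T}A'(h)}$$

$$n_d^*\ (\text{max.}) = \frac{T}{p}\cdot\frac{B(h)}{1 - \dfrac{p}{T}B_d'(h)}$$

$$n_v^*\ (\text{max.}) = \frac{T}{p}\cdot\frac{B(h)}{1 - \dfrac{p}{T}B_v'(h)}$$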





Average Sampling Numbers from Finite Lots 


By S. Vajda 


The following investigation is a by-product of some work connected with quality control, and it 
may be found useful to record the simple results. 

Let a lot of N items be given. We choose single items and classify them as “effective” or “defective” in accordance with well-defined and objective principles. The items are not replaced. We decide to stop at the latest when n (≤ N) items have been picked out, and earlier if we either find c + 1 (≤ n) defective items (this mode of termination of sampling will be called “rejection”), or if we see that it is impossible to collect c + 1 defectives even if we carry on testing up to n items (this case, and also that of reaching n without finding c + 1 defectives, will be called “acceptance”).

It will be obvious that these assumptions are descriptive of a form of sequential single sampling 
from a finite population. We will here only deal with the derivation of the average number it will 
be necessary to sample — the average sampling number. 

This latter is clearly dependent on the number m of defectives in the lot, since the probability of finding a defective at any stage in the sampling process depends on m as well as on N. It will be obvious that if m = 0, then the sampling must terminate by acceptance after n − c items have been chosen, because these will not contain any defective. On the other hand, if m = N we shall stop after the first c + 1 items have all been found defective. But it would be a mistake to conclude that the average sampling number will always lie between c + 1 and n − c (inclusive). We will return to this question in due course.

1. Let us, then, investigate the position for any value of m (0 ≤ m ≤ N). The sampling may terminate by rejection after c + 1 + i (i = 0, 1, 2, . . ., n − c − 1) steps. This happens if exactly c defectives are found among the first c + i items and if the last item is also defective. The probability for such an occurrence is

$$\frac{{}^{c+i}C_{c}\cdot{}^{N-c-i-1}C_{m-c-1}}{{}^{N}C_{m}}.$$

${}^{a}C_{b}$ being zero if either a < b or b < 0, we find that rejection is impossible if m ≤ c. This is also obvious from first principles.

Sampling may also end by acceptance after n − c + j (j = 0, 1, . . ., c) steps. This occurs when there are exactly j defectives among the first n − c + j − 1 items and when the last one is effective. In such a case, with only c − j items left for sampling, it is obviously impossible to find the further c + 1 − j defectives which would be necessary for rejection. The probability of such a sample is

$$\frac{{}^{n-c+j-1}C_{j}\cdot{}^{N-n+c-j}C_{m-j}}{{}^{N}C_{m}}.$$

Here we find that acceptance is not possible if m ≥ N − n + c + 1. This is again as it must be, because in this case even if all N − n unsampled items are defective, there will still be at least c + 1 left among the sampled ones. Nevertheless, we can formally use the above expressions even if they vanish, and we will return to this point when dealing with the final formulae.
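
As a check on the counting argument, the two families of termination probabilities should add up to one for every admissible N, n, m and c. The sketch below assumes the probabilities take the hypergeometric form implied by the verbal description (so many defectives among the items already drawn, the remainder among the items not yet drawn); the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """aCb, taken as zero if a < b or b < 0 (the convention used in the text)."""
    return comb(a, b) if 0 <= b <= a else 0


def rejection_prob(N, n, m, c, i):
    """P(stop by rejection after c + 1 + i items), for i = 0 .. n - c - 1."""
    return Fraction(binom(c + i, c) * binom(N - c - i - 1, m - c - 1), comb(N, m))


def acceptance_prob(N, n, m, c, j):
    """P(stop by acceptance after n - c + j items), for j = 0 .. c."""
    return Fraction(binom(n - c + j - 1, j) * binom(N - n + c - j, m - j), comb(N, m))


def total_prob(N, n, m, c):
    """Sum of all termination probabilities; should equal 1 exactly."""
    rejected = sum(rejection_prob(N, n, m, c, i) for i in range(n - c))
    accepted = sum(acceptance_prob(N, n, m, c, j) for j in range(c + 1))
    return rejected + accepted


if __name__ == "__main__":
    print(total_prob(10, 4, 4, 1))          # exact rational arithmetic
    print(rejection_prob(10, 4, 1, 1, 0))   # rejection impossible when m <= c
    print(acceptance_prob(10, 4, 8, 1, 0))  # acceptance impossible when m >= N - n + c + 1
```

Exact fractions are used so that the sum-to-one check is not blurred by rounding.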

2. In what follows we shall use a lemma concerning the expression

$$\sum_{t=a} {}^{t+x}C_{y}\cdot{}^{u-t}C_{v} \qquad (1)$$

and we will prove it at this stage in order to avoid interruptions in the main development.

We have obviously

$$\sum_{t=a} s_t r_t = s_a\sum_{t=a} r_t + (s_{a+1}-s_a)\sum_{t=a+1} r_t + (s_{a+2}-s_{a+1})\sum_{t=a+2} r_t + \dots$$

Applying this to expression (1) we have:

$$\sum_{t=a} {}^{t+x}C_{y}\,{}^{u-t}C_{v} = {}^{a+x}C_{y}\sum_{t=a}{}^{u-t}C_{v} + {}^{a+x}C_{y-1}\sum_{t=a+1}{}^{u-t}C_{v} + {}^{a+x+1}C_{y-1}\sum_{t=a+2}{}^{u-t}C_{v} + \dots$$

$$= {}^{a+x}C_{y}\,{}^{u-a+1}C_{v+1} + \sum_{t=a}{}^{t+x}C_{y-1}\,{}^{u-t}C_{v+1}.$$

The second term of this expression is seen to reduce to (1) on substituting y − 1 and v + 1 for y and v respectively. Applying this procedure once more we obtain:

$$\sum_{t=a} {}^{t+x}C_{y}\,{}^{u-t}C_{v} = {}^{a+x}C_{y}\,{}^{u-a+1}C_{v+1} + {}^{a+x}C_{y-1}\,{}^{u-a+1}C_{v+2} + \sum_{t=a}{}^{t+x}C_{y-2}\,{}^{u-t}C_{v+2}$$

etc., and finally

$$\sum_{t=a}^{u-v} {}^{t+x}C_{y}\,{}^{u-t}C_{v} = \sum_{t=0}^{u-v-a} {}^{a+x}C_{y-t}\,{}^{u-a+1}C_{v+t+1} \qquad (2)$$

which is the lemma mentioned above. The upper limits could arbitrarily be increased without altering the values of the expressions, but it is essential that the upper limit should be at least the value up to which the binomial coefficients have values different from zero.
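
A lemma of this kind is easy to test by brute force. The sketch below assumes the lemma reads: the sum over t from a to u − v of C(t + x, y) * C(u − t, v) equals the sum over t from 0 to u − v − a of C(a + x, y − t) * C(u − a + 1, v + t + 1). The symbol names a, x, y, u, v are this sketch's own labels for the parameters and may differ from the printed notation.

```python
from math import comb


def binom(a, b):
    """aCb, taken as zero if a < b or b < 0."""
    return comb(a, b) if 0 <= b <= a else 0


def lemma_lhs(a, x, y, u, v):
    """Left-hand side: sum over t = a .. u - v."""
    return sum(binom(t + x, y) * binom(u - t, v) for t in range(a, u - v + 1))


def lemma_rhs(a, x, y, u, v):
    """Right-hand side: sum over t = 0 .. u - v - a."""
    return sum(binom(a + x, y - t) * binom(u - a + 1, v + t + 1)
               for t in range(0, u - v - a + 1))


if __name__ == "__main__":
    # Exhaustive comparison over a small grid of non-negative parameters.
    mismatches = [
        (a, x, y, u, v)
        for a in range(4) for x in range(3) for y in range(4)
        for u in range(a, a + 6) for v in range(4)
        if lemma_lhs(a, x, y, u, v) != lemma_rhs(a, x, y, u, v)
    ]
    print("mismatches:", len(mismatches))
```

The grid is small on purpose; it is meant as a sanity check of the stated form, not as a proof.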

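3. We proceed to calculate the average sampling number, $S_n$ say, for given N, n, m and c. We have:

$$S_n = \sum_{i=0}^{n-c-1}(c+1+i)\,\frac{{}^{c+i}C_{c}\,{}^{N-c-i-1}C_{m-c-1}}{{}^{N}C_{m}} + \sum_{j=0}^{c}(n-c+j)\,\frac{{}^{n-c+j-1}C_{j}\,{}^{N-n+c-j}C_{m-j}}{{}^{N}C_{m}}$$

$$= (c+1)\sum_{i=0}^{n-c-1}\frac{{}^{c+1+i}C_{c+1}\,{}^{N-c-1-i}C_{m-c-1}}{{}^{N}C_{m}} + (n-c)\sum_{i=0}^{c}\frac{{}^{n-i}C_{c-i}\,{}^{N-n+i}C_{m-c+i}}{{}^{N}C_{m}} \qquad (3)$$

on writing j = c − i.

If m ≤ c, the first term disappears. If, in particular, m = 0, then we have $S_n$ = n − c, because only the term for j = 0 (that is, i = c) remains in the second sum. If m ≥ N − n + c + 1, the second term disappears, and if, in particular, m = N, then only the term for i = 0 remains in the first sum, so that then $S_n$ = c + 1. The first sum can be transformed into a more convenient form, however, by applying our lemma (2). For this purpose we must write the summand in the form ${}^{i+(c+1)}C_{c+1}\cdot{}^{(N-c-1)-i}C_{m-c-1}$. We then obtain immediately

$$\sum_{i=0}^{n-c-1}\frac{{}^{c+1+i}C_{c+1}\,{}^{N-c-1-i}C_{m-c-1}}{{}^{N}C_{m}} = \sum_{i=0}^{N-c-1}\frac{{}^{c+1+i}C_{c+1}\,{}^{N-c-1-i}C_{m-c-1}}{{}^{N}C_{m}} - \sum_{i=0}^{c+1}\frac{{}^{n+1}C_{i}\,{}^{N-n}C_{m+1-i}}{{}^{N}C_{m}} \qquad (4)$$

The first term is seen to be $\dfrac{N+1}{m+1}$, so that we have

$$S_n = (c+1)\left[\frac{N+1}{m+1} - \sum_{i=0}^{c+1}\frac{{}^{n+1}C_{i}\,{}^{N-n}C_{m+1-i}}{{}^{N}C_{m}}\right] + (n-c)\sum_{i=0}^{c}\frac{{}^{n-i}C_{c-i}\,{}^{N-n+i}C_{m-c+i}}{{}^{N}C_{m}} \qquad (5)$$

We find thus that if m ≥ N − n + c + 1, then all the binomial coefficients in the sums disappear and $S_n$ becomes

$$(c+1)\,\frac{N+1}{m+1}$$

which is independent of n. This fact was to be expected, because if m is sufficiently large, then a decision (by rejecting the lot) is bound to be reached at a stage before n items were investigated. For m = N, we have, of course, again $S_n$ = c + 1.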
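
The average sampling number can be computed directly as the expected stopping time, which gives an independent check on the limiting value (c + 1)(N + 1)/(m + 1). The sketch below assumes the hypergeometric termination probabilities described in section 1; `average_sampling_number` is an illustrative name, not the paper's.

```python
from fractions import Fraction
from math import comb


def binom(a, b):
    """aCb, taken as zero if a < b or b < 0."""
    return comb(a, b) if 0 <= b <= a else 0


def average_sampling_number(N, n, m, c):
    """Expected number of items drawn before stopping, as an exact fraction."""
    # Rejection after c + 1 + i items: c defectives in the first c + i, then a defective.
    rejected = sum((c + 1 + i) * binom(c + i, c) * binom(N - c - i - 1, m - c - 1)
                   for i in range(n - c))
    # Acceptance after n - c + j items: j defectives in the first n - c + j - 1, then an effective.
    accepted = sum((n - c + j) * binom(n - c + j - 1, j) * binom(N - n + c - j, m - j)
                   for j in range(c + 1))
    return Fraction(rejected + accepted, comb(N, m))


if __name__ == "__main__":
    N = 10
    print(float(average_sampling_number(N, 2, 1, 0)))  # N = 10, c = 0, m = 1, n = 2
    print(float(average_sampling_number(N, 3, 2, 1)))  # N = 10, c = 1, m = 2, n = 3
    # Large-m limit: independent of n once m >= N - n + c + 1.
    print(average_sampling_number(N, 6, 6, 1), Fraction(2 * (N + 1), 6 + 1))
```

For N = 10 the first two calls give 1.9 and about 2.36, the values printed in the tables for those cells.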





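For c = 0 or 1 the formula for $S_n$ reduces to the following expressions which are convenient for computation:

$$c = 0:\qquad S_n = \frac{N+1}{m+1}\left[1 - \frac{{}^{N-n+1}C_{m+1}}{{}^{N+1}C_{m+1}}\right]$$

$$c = 1:\qquad S_n = \frac{2(N+1)}{m+1}\left[1 - \frac{{}^{N-n+1}C_{m+1}}{{}^{N+1}C_{m+1}}\right] - (n+1)\,\frac{{}^{N-n+1}C_{m}}{{}^{N}C_{m}}$$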

By remembering that 


it can easily be shown that for c = 0, $S_n$ decreases (or remains equal) as m increases. However, this does not hold for higher values of c, as will be seen from the attached tables, which have been worked out for N = 10, and c = 0 and c = 1.

It is also easy to prove that for c = 0 and n = 1 the value of $S_1$ is always 1, whatever N. Clearly this result follows also from first principles.


Average sampling numbers 

$$S_n = \frac{N+1}{m+1}\left[1 - \frac{{}^{N-n+1}C_{m+1}}{{}^{N+1}C_{m+1}}\right]$$

N = 10, c = 0

   m    n = 1     2       3       4       5       6       7       8       9      10
   0      1       2       3       4       5       6       7       8       9      10
   1      1      1.9     2.7     3.4      4      4.5     4.9     5.2     5.4     5.5
   2      1      1.8     2.42    2.89    3.22    3.44    3.58    3.64    3.67    3.67
   3      1      1.7     2.17    2.46    2.625   2.71    2.74    2.75    2.75    2.75
   4      1      1.6     1.93    2.1     2.17    2.20    2.2     2.2     2.2     2.2
   5      1      1.5     1.72    1.81    1.83    1.83    1.83    1.83    1.83    1.83
   6      1      1.4     1.53    1.57    1.57    1.57    1.57    1.57    1.57    1.57
   7      1      1.3     1.37    1.375   1.375   1.375   1.375   1.375   1.375   1.375
   8      1      1.2     1.22    1.22    1.22    1.22    1.22    1.22    1.22    1.22
   9      1      1.1     1.1     1.1     1.1     1.1     1.1     1.1     1.1     1.1
  10      1       1       1       1       1       1       1       1       1       1


Average sampling numbers 

$$S_n = \frac{2(N+1)}{m+1}\left[1 - \frac{{}^{N-n+1}C_{m+1}}{{}^{N+1}C_{m+1}}\right] - (n+1)\,\frac{{}^{N-n+1}C_{m}}{{}^{N}C_{m}}$$

N = 10, c = 1

   m    n = 2     3       4       5       6       7       8       9      10
   0      1       2       3       4       5       6       7       8       9
   1     1.1     2.2     3.3     4.4     5.5     6.6     7.7     8.8     9.9
   2     1.2     2.36    3.44    4.44    5.33    6.22    6.69    7.11    7.33
   3     1.3     2.47    3.46    4.25    4.83    5.22    5.425   5.5     5.5
   4     1.4     2.53    3.53    3.91    4.23    4.36    4.4     4.4     4.4
   5     1.5     2.56    3.57    3.515   3.64    3.67    3.67    3.67    3.67
   6     1.6     2.53    2.97    3.11    3.14    3.14    3.14    3.14    3.14
   7     1.7     2.47    2.71    2.75    2.75    2.75    2.75    2.75    2.75
   8     1.8     2.36    2.44    2.44    2.44    2.44    2.44    2.44    2.44
   9     1.9     2.2     2.2     2.2     2.2     2.2     2.2     2.2     2.2
  10      2       2       2       2       2       2       2       2       2


In both tables:

N = number in lot
m = number of defectives in lot
c = maximum number of defectives in a sample leading to “acceptance”
n = maximum number in sample.






The functions tabulated assume, of course, the values c + 1 (= 1 or 2) when m = 10, and n − c when m = 0. We see further that in the table corresponding to c = 0 the values in all columns decrease with increasing m except for n = 1, when c + 1 = n − c, and therefore the two marginal (and extreme) values are equal.


Summary, 

The purpose of the preceding note is to derive a formula for the average sampling number necessary in sequential single sampling from a finite population or lot of N items. The maximum number of items in the sample is fixed at n, and sampling terminates earlier than this if either c + 1 defectives are found or if it becomes obvious that they could not materialize even if sampling were to continue until all n items had been drawn.

The probabilities for termination of sampling after any given number of items sampled are derived. The average number thus sampled is then calculated, and tables are computed for c = 0 and for c = 1 corresponding to a lot of 10 items.





The Use of the Negative Binomial Distribution in an Industrial Sampling Problem 

By M. E. Wise, M.A. 

(Material Research Laboratory, Philips Lamps Ltd., New Road, Mitcham, Surrey)

1. Introduction

In the binomial distribution with negative index, the probability of r occurrences is the coefficient of $t^r$ in

$$(1 + p - pt)^{-K} \qquad (1.1)$$

or

$$(1 - s)^{K}(1 - st)^{-K} \qquad (1.2)$$

where $s = \dfrac{p}{1+p}$.

The mean $\mu = Kp = \dfrac{Ks}{1-s}$.

Thus the Poisson distribution whose mean is μ is a limiting case, with K infinite.

Kendall (1943) gives several examples of frequency distributions to which it gives a much better fit than the Poisson distribution. It should do so in particular when individual numbers are taken from Poisson distributions with varying means.* This case frequently occurs in practice, and the negative binomial should therefore be used more widely. It is hoped to show in this paper that it may be readily applied numerically.

This is mostly done with the aid of the Incomplete Beta function, for which two results are deduced in the appendix that appear to be new.

2. Details of the industrial problem

Batches of reels of enamelled wire are being tested for pinholes in the enamel. A batch is a consignment from one supplier, and usually has more than 50 reels each about 1,200 yards long. The test is on a reel over a standard length which is 50 yards to accord with a British Standard Specification and is drawn through a bath of electrically conducting liquid so that the insulation breaks down at every pinhole in the enamel.

Previously a 50-yard length had been tested on each of one third of the reels. Distributions 
of numbers of pinholes were then recorded. If any length had more than R pinholes, every reel 
in the batch was tested. R was 10 for the thickest gauges of wire (28 to 32 S.W.G.), and above 32 
S.W.G. it was graded continuously to rise to 22 for the finest wire (47 S.W.G.). The problem was 
to find a more reliable criterion for accepting or rejecting the batches, which should, if possible, 
also require fewer reels to be tested. 

3. Relating the problem to the negative binomial

The user of modern statistics often performs as valuable a service in defining a problem as in solving it. It is so in this case, where the quality of the product can vary continuously.

We first consider the method of sampling. Several miles of wire are made at one time, and it 
is afterwards divided into reels. These will be thoroughly mixed by the time they reach the testing 
department. Tests on successive lengths of one wire show little significant variation compared 
with that from one reel to another, so it may be assumed that within one wire the numbers of 
holes in 50-yard lengths have a Poisson distribution with a constant mean. But there is an inevi- 
table and continuous change in conditions of manufacture leading to fluctuations, that are negligible 
over 50 yards, in the mean number of pinholes. 

Therefore if the lengths tested are all on different reels, the result should be slightly more precise 
than that from a random sample from all the wire in the batch. 

The difference should be negligible when only a small proportion of the reels are tested, for then the number of reels having more than one length on them tested would be too small to affect the results, and it is this case for which the sampling should be most useful in saving labour. We shall, therefore, assume that we are taking 50-yard lengths at random from a population in which the number of pinholes is distributed as in (1.1).

* The distribution is binomial with negative index if the means have a χ² distribution.

The wire is to be wound on a coil, with many turns in 50 yards. If two pinholes come together, 
the insulation could break down. The points in contact are at a distance measured along the wire 
that is small compared with 50 yards. From these facts we shall derive a simple approximation 
to the quality of a batch, measured as a probability of leakage, in terms of the parameters of r. 

We have to assume that there is a leak if any two pinholes overlap. If the area of each pinhole is δ, an equivalent condition is that the centre of one pinhole should be over a circle concentric to another pinhole, with twice its radius.

Let $h_r$ be the mean number of leaks per unit length when there are r pinholes in the standard length l.

Starting from the centre of a pinhole, we take a point at an assigned distance along the wire on its surface, that depends on the coil winding. Then the probability that this point is within a circle of area 4δ concentric with another pinhole is

$$\frac{(r-1)\,4\delta}{2\pi a l},$$

if a is the radius of the wire. We can similarly start from each of the r pinholes in the length l, but this gives every leak twice, so that:

$$h_r = \frac{r(r-1)\,4\delta}{2\cdot 2\pi a l} = \frac{r(r-1)\,\delta}{\pi a l} \qquad (3.1)$$

This assumes, of course, that the leaks are distributed independently of each other over the standard length l, which is justified according to the discussion above.

Then over the whole length of wire the mean number of leaks h per unit length is given by

$$h = \sum_{r=2}^{\infty} P_r h_r = \frac{2\delta}{\pi a l}\sum_{r=2}^{\infty}\frac{r(r-1)}{2}P_r \qquad (3.2)$$

Now $\sum_{r=2}^{\infty}\frac{r(r-1)}{2}P_r$ is half the second factorial moment of the distribution, and is, therefore, the coefficient of $u^2$ in the factorial moment generating function obtained by putting t = 1 + u in the generating function $(1 + p - pt)^{-K}$, viz.:

$$(1 - pu)^{-K} = 1 + Kpu + \frac{K(K+1)}{2}p^2u^2 + \dots \quad \text{(see Aitken (1939))}$$

Then

$$\sum_{r=2}^{\infty}\frac{r(r-1)}{2}P_r = \frac{K(K+1)}{2}p^2 = \frac{\mu^2}{2}\left(1 + \frac{1}{K}\right) \qquad (3.3)$$

so that

$$h = \frac{\delta\mu^2}{\pi a l}\left(1 + \frac{1}{K}\right).$$

The mean is, therefore, more important than the index in determining the quality of a batch. This is fortunate, because it is much more difficult to devise a simple but efficient test for controlling K than to find the mean, of which the distribution in a sample of N is known exactly for a given K.
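
The factorial-moment step can be checked numerically. The sketch below assumes the negative binomial probabilities are the coefficients of t^r in (1 − s)^K (1 − st)^(−K), with mean μ = Ks/(1 − s), and compares a direct summation of r(r − 1)/2 · P_r with the closed form (μ²/2)(1 + 1/K). Function names are illustrative.

```python
from math import exp, lgamma, log


def nb_pmf(r, K, s):
    """P_r: coefficient of t**r in (1 - s)**K * (1 - s*t)**(-K)."""
    log_p = (lgamma(K + r) - lgamma(K) - lgamma(r + 1)
             + K * log(1 - s) + r * log(s))
    return exp(log_p)


def half_second_factorial_moment(K, s, r_max=2000):
    """Direct summation of r(r-1)/2 * P_r, truncated at r_max (terms beyond are negligible)."""
    return sum(r * (r - 1) / 2 * nb_pmf(r, K, s) for r in range(2, r_max))


def closed_form(K, s):
    """(mu**2 / 2) * (1 + 1/K) with mu = K*s/(1 - s)."""
    mu = K * s / (1 - s)
    return mu * mu / 2 * (1 + 1 / K)


if __name__ == "__main__":
    for K, s in ((1.5, 0.70), (1.5, 0.85), (3.0, 0.50)):
        print(K, s, half_second_factorial_moment(K, s), closed_form(K, s))
```

Because the closed form depends on K only through the factor (1 + 1/K), a change in the mean moves the leak rate much more than a comparable change in the index.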

In the results analysed K varied from batch to batch much less than μ. The best values for K were found for many samples by the method of maximum likelihood (appendix A); they were always fairly near to 1.5. When K was considerably smaller, corresponding to a more variable batch, μ was higher. If a particular reel had a very large number of leaks, it was usually evident from a visual inspection that it was unsatisfactory. A similar situation arises in many quality-control problems, where, in solving them, it is assumed that the mean of the quantity being sampled changes from batch to batch, whilst its variance remains constant.

For a given population mean μ, and K = 1.5, we could find the distribution of the maximum in a sample of N, and of the number of zeros. If either is larger than could be reasonably expected from sample fluctuations, the batch could be rejected even if its mean was satisfactory. This is not a very sensitive test for K, but should be a reasonable safeguard in practice against passing an unusually variable distribution.


4. Principle of an acceptance test for controlling the mean

For each gauge batches were defined to be “bad” if their mean number of pinholes per 50 yards exceeded a certain μ*. (The method of assigning values to μ* will be discussed later.) The criterion was that the probability of accepting on its sample mean a bad batch should be not more than 0.05 for any given number N in the sample.

However, if N was fixed in advance, there would be needless testing on good batches. Therefore 10 reels were usually tested at first, and if the resulting sample mean was too great for acceptance, N was increased to 15, and if necessary to 20, 30, and 60. As discussed more fully in paragraph 5, such a sequential system must increase the overall risk of accepting a bad batch, but this risk was kept reasonably small by limiting the number of permissible values of N.

Obviously the probability of acceptance is greatest for a batch that is just bad, with mean μ* exactly. As discussed in § 3, we assume that K = 1.5 all the time, and then the highest sample mean m that will be accepted in a sample of N is the 5 per cent. point in the distribution of m from a population with mean μ*.

To calculate m it is most convenient to work with sample totals Nm. The probability of obtaining Nm in a sample is, from (1.2), the coefficient of $t^{Nm}$ in

$$(1 - s)^{NK}(1 - st)^{-NK} \qquad (4.1)$$

As pointed out in the introduction to Pearson’s (1934) tables, the sum of a finite number of terms of this series is an Incomplete Beta Function. The relation is

$$(1-s)^{K}\,\frac{K(K+1)\dots(K+r-1)}{r!}\,s^{r}\left\{1 + \frac{K+r}{r+1}\,s + \frac{(K+r)(K+r+1)}{(r+1)(r+2)}\,s^{2} + \dots\right\} = \frac{\int_0^{s} x^{r-1}(1-x)^{K-1}\,dx}{\int_0^{1} x^{r-1}(1-x)^{K-1}\,dx} = I_s(r, K) \qquad (4.2)$$

so that the probability of r or more in this distribution is

$$\sum_{r}^{\infty} P_r = I_s(r, K) \qquad (4.3)$$

and the probability in a sample of N that the total is Nm or more is

$$I_s(Nm, NK). \qquad (4.4)$$

Corresponding to μ* is a value of s, s*, given by

$$s^* = \frac{\mu^*}{\mu^* + K}. \qquad (4.5)$$

The values of m therefore satisfied the equation

$$I_{s^*}(Nm, NK) = 0.95 \qquad (4.6)$$

with K = 1.5.

The method of solving (4.6) is given in appendix B. Solutions (Table II) were obtained for six different values of s* corresponding to various gauges (Table I).

It remains to say how the values of μ* were assigned. They were chosen so that about 6 per cent. of a batch with mean μ* (“just bad”) had more than R leaks in 50 yards, with K = 1.5 and R having the same meaning as in § 2, and varying with the gauge number between 10 and 22 inclusive.
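
The link between the tail of the negative binomial and the Incomplete Beta function can be illustrated without special-function libraries. The sketch below assumes P_r is the coefficient of t^r in (1 − s)^K (1 − st)^(−K); it sums the tail directly and compares it with I_s(r, K) obtained by Simpson's rule. The helper names and the choice of Simpson's rule are this sketch's own, not the paper's method of computation.

```python
from math import exp, lgamma, log


def nb_pmf(r, K, s):
    """P_r: coefficient of t**r in (1 - s)**K * (1 - s*t)**(-K)."""
    log_p = (lgamma(K + r) - lgamma(K) - lgamma(r + 1)
             + K * log(1 - s) + r * log(s))
    return exp(log_p)


def nb_tail(r, K, s):
    """Probability of r or more occurrences, by direct summation."""
    return 1.0 - sum(nb_pmf(i, K, s) for i in range(r))


def incomplete_beta(s, r, K, steps=20000):
    """I_s(r, K) by Simpson's rule; adequate for r >= 1, K >= 1 and s < 1."""
    def integrand(x):
        return x ** (r - 1) * (1 - x) ** (K - 1)

    width = s / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(s)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * width)
    complete = exp(lgamma(r) + lgamma(K) - lgamma(r + K))  # B(r, K)
    return total * width / 3 / complete


if __name__ == "__main__":
    K, s = 1.5, 0.70  # the "just bad" batch for 28-32 S.W.G.
    for r in (2, 5, 10):
        print(r, round(100 * nb_tail(r, K, s), 1), round(100 * incomplete_beta(s, r, K), 1))
```

For s* = 0.70 the tail at r = 10 comes out at roughly 6 per cent., in line with the way the values of μ* are said to have been assigned.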


Table I

Distributions with K = 1.5 for various means μ* assigned to correspond to batches that are just bad.

                                    Percentage equal to or exceeding r
  Gauge, S.W.G.    s*       μ*      r: 2       5       10       15       20
  28-32           0.70     3.5      66.3     29.2      6.3      1.2      0.2
  34              0.72     3.857    69.2     32.9      8.1      1.8      0.4
  38              0.77     5.022    76.2     43.4     14.8      4.7      1.4
  42              0.81     6.395    81.7     53.1     22.9      9.3      3.6
  44              0.82     6.833    83.0     55.6     25.5     10.9      4.5
  47              0.85     8.50     86.8     63.6     34.4     17.5      8.7




Table I, obtained directly from Pearson’s Tables, gives batches that were “just bad” with the new definition for various gauges. Table II gives corresponding values of m.

To find how the new criterion for acceptance with the above values of s* is related to the old one, we require the distribution of the maximum in a sample of N, and this is found in the next section.


Table II

Solutions for $N\bar m$ of $I_{s^*}(N\bar m, 1.5N) = 0.95$.

(Values of Nm not to be exceeded in a sample of N, for gauges in Table I corresponding to the values of s*.)
N.B. For the practical application, obviously we only require the nearest whole number below and we do not guarantee that the decimal place is exactly right.

Values of Nm for various values of N and s*:

   s*     N: 5      10       15       20       30       60
  0.70    7.75     19.5     32.9     47.1     76.5    168.7
  0.72    8.1      21.7     36.5     52.2     84.7    186.6
  0.77   10.9      28.8     48.6     69.7    111.5    244.6
  0.81   14.1      37.3     62.4     89.0    143.7    313.3
  0.82   15.2      39.9     66.9     95.0    153.2    335.0
  0.85   19.2      50.1     83.9    119.0    191.9    418.3


5. Distribution of the maximum R in a sample of N

Let $Q_R = I_s(R, K)$ be the probability of getting at least R. Then the probability that R will not be attained in a sample of N is

$$(1 - Q_R)^{N} \qquad (5.1)$$

We have obtained solutions of

$$(1 - Q_R)^{N} = 0.5 \qquad (5.2)$$

$$(1 - Q_R)^{N} = 0.9 \qquad (5.3)$$

for various values of N (Table III).

Table III

Solutions of $(1 - Q_R)^{N}$ = 0.5, 0.9.

  N                            5         10         15         20         30         60
  Q_R (satisfying (5.2))    0.12947    0.06697    0.04516    0.03406    0.02734    0.02284
  Q_R (satisfying (5.3))    0.02100    0.01048    0.00700    0.00525    0.00420    0.00351
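
Equations of this type have a closed-form solution, since (1 − Q)^N = P gives Q = 1 − P^(1/N). The short sketch below evaluates that expression; the function name `q_required` is illustrative.

```python
def q_required(N, p_not_attained):
    """Solve (1 - Q)**N = p_not_attained for Q."""
    return 1.0 - p_not_attained ** (1.0 / N)


if __name__ == "__main__":
    for N in (5, 10, 15, 20, 25, 30, 60):
        print(N, round(q_required(N, 0.5), 5), round(q_required(N, 0.9), 5))
```

Under this formula the tabulated entries for N = 10, 15 and 20 are reproduced closely, and the pairs printed as 0.02734, 0.00420 and 0.02284, 0.00351 are the values obtained at N = 25 and N = 30 respectively.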

The corresponding percentage points of R are solutions of

$$I_x(R, K) = Q_R \qquad (5.4)$$

where m is the sample mean and $x = \dfrac{m}{m + K}$.

Table IV gives these solutions for K = 1.5 and various values of Nm, including in particular the nearest whole numbers below the values of Nm in Table II. It is actually an inverse table obtained by interpolation from probabilities that a sample will have a maximum of R or more, since only integral values of R are physically possible.

It will be noticed that the 50 per cent. values of R for samples of 30 which just pass on the new test correspond approximately to the values of R and would just have been rejected on the old system with a batch size of about 100. However, since the testing is stopped as soon as a sample mean satisfies Table II, there can be a considerable saving in the amount of testing of good batches.

As discussed in § 3, we can now give a simple criterion for rejecting a sample, even if its mean value is satisfactory, for being too variable. If we take the highest integral value of Nm below $N\bar m$ and reject any sample whose maximum exceeds its 90% R of Table IV, one sample in 10 that just passes the test for sample means will be rejected wrongly, and this probability increases as the population value of K decreases. It is possible that it does not increase rapidly enough, and in that case the 75 per cent. point (say) should be chosen rather than the 90 per cent. point.

The question of a more sensitive test for K is another problem. Also there seems to be no published work on the problem of a joint variation in μ and K assuming no previous knowledge of K.



Table IV

50% and 90% points of the distribution of the maximum R in samples of N from populations with various means m and K = 1.5. (N.B. 50% or 90% of the sample R's are less than the values tabulated.)

N = 5
  Nm                      15      12      10       9       8       7       6
  50 p.c. value of R     6.75     5.7     4.9     4.5     4.1     3.7     3.2
  90 p.c. value of R    11.8      9.8     8.4     7.9     7.2     6.4     5.8

N = 10
  Nm                      39      35      29      25      21      19      15
  50 p.c. value of R    10.7      9.8     8.2     7.3     6.4     5.9     4.9
  90 p.c. value of R    17.0     15.5    13.4    11.8    10.5     9.5     7.9

N = 15
  Nm                      66      60      54      48      45      42      32     27.5
  50 p.c. value of R    13.5     12.3    11.3    10.4     9.7     9.1     7.5     6.5
  90 p.c. value of R    20.4     18.9    17.0    15.7    14.7    13.8    11.3     9.8

N = 20
  Nm                      99      85      69      66      63      47      42
  50 p.c. value of R    15.5     14.1    11.8    11.3    10.8     8.4     7.8
  90 p.c. value of R    22.0     21.0    17.5    16.85   16.2    12.7    11.6

N = 30
  Nm                     335     300     244     224     168     160
  50 p.c. value of R    23.0     20.9    17.3    16.1    12.7    12.1
  90 p.c. value of R    31.5     28.5    23.9    22.0    17.4    16.7


An obvious extension of the calculations is to have values of Nm for every value of N. Dr. Bartlett has pointed out that we would then have a complete sequential system (Barnard, 1946; Wald, 1945) for acceptance only. For example, such a system would be obtained by solving (4.6) for every value of N above, say, $N_1$ = 10. The overall risk of accepting a batch that is just “bad” would then, of course, be much larger than 0.05, for it is the sum of probabilities of acceptance at N = $N_1$, and of non-acceptance at N = $N_1$ but acceptance at N = $N_2$, for $N_1$ = 10, 11, 12, 13 . . . and $N_2$ = $N_1$ + 1. With our system $N_1$ = 10, 15, 20, etc., and the corresponding values of $N_2$ are 15, 20, 30, etc., respectively, and of course a much smaller additional risk results. We did not consider the lengthy problem of evaluating this risk, especially as the definition of a bad batch is itself empirical.

Meanwhile it seems clear that there has been a considerable improvement on the previous system 
for accepting batches, and that the new one was fairly satisfactory under war-time conditions, 
in which it was necessary to test every reel in a batch which failed on the sample tests. 


MATHEMATICAL APPENDIX 


A. Fitting negative binomials to observed distributions by the method of maximum likelihood,
with examples


This section is included to show what we believe from experience to be the most convenient
form of the maximum likelihood equations for $\mu$ and $K$ for solving them numerically.

Fisher (1941) calculates the efficiency of the estimate of $K$ obtained from the sample mean and
variance, viz.:

$$\hat{K} = \frac{\bar{r}^{2}}{v - \bar{r}} \tag{A.1}$$

where $\bar{r}$ is now used to denote the sample mean and $v$ the sample variance. A rough test applying
this formula shows that the estimate would not be very efficient in our case.

Writing $s = \mu/(\mu + K)$ as before, Fisher's expression for the reciprocal of the efficiency $E$
is:

$$\frac{1}{E} = 1 + 2\left\{\frac{s}{3}\cdot\frac{2}{K+2} + \frac{s^{2}}{4}\cdot\frac{2\cdot 3}{(K+2)(K+3)} + \frac{s^{3}}{5}\cdot\frac{2\cdot 3\cdot 4}{(K+2)(K+3)(K+4)} + \dots\right\}$$

Thus $1/E \approx 1.5$ when $s = 0.75$ and $K = 1.5$.
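The numerical statement above can be checked with a short calculation. The sketch below is illustrative only: it assumes the series for the reciprocal efficiency has the general term $j!\,s^{j-1}/\{(j+1)(K+2)(K+3)\cdots(K+j)\}$ for $j = 2, 3, \dots$, which is our reading of the printed formula, and the function name `reciprocal_efficiency` is hypothetical.

```python
def reciprocal_efficiency(s, K, tol=1e-12, max_terms=10_000):
    """Sum 1/E = 1 + 2 * sum_{j>=2} j! s^(j-1) / ((j+1)(K+2)...(K+j)).

    The form of the general term is an assumption (see the lead-in).
    """
    total = 0.0
    # ratio holds j! * s^(j-1) / ((K+2)...(K+j)), starting at j = 2
    ratio = 2.0 * s / (K + 2.0)
    j = 2
    while j < max_terms:
        term = ratio / (j + 1)
        total += term
        if term < tol:
            break
        j += 1
        ratio *= j * s / (K + j)  # step the running ratio from j-1 to j
    return 1.0 + 2.0 * total


value = reciprocal_efficiency(0.75, 1.5)
print(round(value, 3))
```

With $s = 0.75$ and $K = 1.5$ the sum comes out close to the value 1.5 quoted in the text, i.e. the moment estimate of $K$ would have an efficiency of roughly two-thirds.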




Haldane (1941) derives the equations of maximum likelihood and shows that we can easily
solve them for $p$ and $K$. It is more convenient in our case to introduce $\mu$ for the population mean,
so that $p = \mu/K$.

Then from (1.1) and (1.2), if $r$ occurs $n_r$ times in the sample of $N$,

$$\log P_r = -(r + K)\log(\mu + K) + r\log\mu + K\log K + \log(K + r - 1)! - \log(K - 1)! - \log r! \tag{A.2}$$

$$L = \sum_{r=0}^{\infty} n_r \log P_r \tag{A.3}$$

Write $m$ and $k$ for the maximum likelihood estimates of $\mu$ and $K$. Then

$$\frac{\partial L}{\partial \mu} = \frac{NK(\bar{r} - \mu)}{\mu(\mu + K)} \tag{A.4}$$

so, as is well known, $m = \bar{r}$.

The expression for $\partial L/\partial K$ involves the sum $s_K$ defined by

$$s_K = \sum_{r=1}^{\infty} n_r F(K + r - 1) - (N - n_0)F(K - 1) \tag{A.5}$$

where the digamma function

$$F(x) = \frac{d}{dx}\log\Gamma(x + 1).$$

Then

$$\frac{\partial L}{\partial K} = s_K - N\log\frac{\mu + K}{K} + \frac{N(\mu - \bar{r})}{\mu + K} \tag{A.6}$$

Haldane (1941) prefers to avoid using the digamma function. His formula is equivalent to putting

$$s_K = \frac{n_1 + n_2 + \dots + n_R}{K} + \frac{n_2 + \dots + n_R}{K + 1} + \dots + \frac{n_R}{K + R - 1}$$

where $R$ is the largest value of $r$ observed. This is no more laborious if the number of frequency
groups is small, but much more so if it is large.

At the maximum $\partial L/\partial K = 0$ with $\mu = m$ and $K = k$, and from (A.6):

$$m + k = k e^{s_k/N} \tag{A.7}$$

which probably corresponds to the form which Haldane (1941) finds best for solving for $k$, according
to his last paragraph.*

We have often found solutions near the minimum of $k e^{s_k/N}$ as a function of $k$. The estimates
for $k$ and $m$ are independent, for

$$\frac{\partial^{2} L}{\partial K\,\partial\mu} = \frac{N(m - \mu)}{(\mu + K)^{2}} \tag{A.8}$$


In calculating $s_K$ we require values of $F(r + \theta)$, where $0 < \theta < 1$, for $r = 0, 1, 2, 3, \dots$ in
turn. For this Pairman's (1919) tables are arranged most usefully, but only go to $r = 20$. The
B.A. Tables (1931) and those of Davis (1933) give values up to $r = 60$ and 50 respectively, but are
arranged less conveniently. It was decided to copy all the values of $F(r + K)$ for values of $r$
from 1 to 25, with $K = 0.6, 0.7, 0.8, \dots, 1.5$, so that all the entries for the same $K$ were in one
column. Values of $s_K$ were then quickly completed on the machine.

$m$ was between 2 and 7 and about a tenth of the samples had $k < 1$, but nearly always it was
between 1 and 2. Much of the statistical analysis was on "bad" batches in which a 50-yard length
on every reel had been tested, thus providing a large sample.

We append three of the distributions with a summary of the numerical work. 


* There may be an error in this paragraph. Haldane's equation (2.1) is, with our notation,
$N\{\log(m + k) - \log k\} = s_k$. He states that if this is multiplied by $k$ one side increases with $k$ and the
other decreases. To obtain the form which we have found most convenient, one must first divide by $N$,
take the exponentials of both sides, and then multiply them by $k$.





Table Va

Observed frequencies in a batch of reels of 30 S.W.G. wire

r      0   1   2   3   4   5   6   7   8   9  10  11  15  16  17  19  25  30  43
n_r    9  12  10  20  11  12   6  10   5   7   6   2   3   1   1   2   2   1   1

n_r (calculated): 10.8, 13.1, 12.9, 12.1, 10.9, 9.7, 8.3 for r = 0 to 6; 13.2, 22.2 and 7.7 for
the three groups of larger r.

Numerical values of:

N                      121
Nm                     726
m                      6.01
s_1.1                  238.7
s_1.5                  195.1
1.5 e^(s_1.5/N)        7.53
k                      just above 1.5
m/(m + k)              0.8
χ²                     8.07
Degrees of freedom     7
P                      0.32

More than half of χ² is in the discrepant value of n_3. On the other hand, some other isolated
irregularities in the frequencies of the larger numbers are removed by the grouping necessary to
make all the theoretical frequencies large enough for the χ² test.


Table Vb

Observed frequencies in a batch of reels of 36 S.W.G. wire

r      0   1   2   3   4   5   6   7   8   9  10
n_r    9   6   8   1   3   5   2   0   1   0   1

n_r (calculated, grouped): 9.2, 8.8, 8.0, 5.0, 5.0

Numerical values of:

N                      36
Nm                     96
m                      2.67
s_1.4                  38.05
1.4 e^(s_1.4/N)        4.03
s_1.3                  40.15
1.3 e^(s_1.3/N)        3.965
k                      1.28
m/(m + k)              0.676
χ²                     3.1
Degrees of freedom     2
P (from χ²)            0.22





Table Vc

Observed frequencies in a batch of reels of 44 S.W.G. wire

r      0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
n_r    5   2   5   4   4   1   0   1   1   2   2   0   2   1   0   1

n_r (calculated, grouped): 9.0, 7.1, 6.8, 4.7, 3.4

Numerical values of:

N                      31
Nm                     150
m                      4.84
s_1.0                  56.98
e^(s_1.0/N)            6.284
s_1.2                  50.73
1.2 e^(s_1.2/N)        6.161
s_1.4                  45.98
1.4 e^(s_1.4/N)        6.168
k                      1.32
m/(m + k)              0.786
χ²                     1.69
Degrees of freedom     1
P                      0.22

χ² was obtained by grouping all the frequencies from 7 upwards.

Since the distributions are of whole numbers, the values of χ² are possibly overestimated,
but in any case the fits are as good as is needed for quality control. It is clear also how inaccurate
it would be to assume that these are Poisson distributions.
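The fitting procedure of this section can be sketched in a few lines. The example below is only an illustration under stated assumptions: the frequencies are those of Table Vc as we read them (31 reels, 150 pinholes in all), the sum $s_K$ is computed in Haldane's digamma-free form, and equation (A.7), taken as $m + k = k e^{s_k/N}$, is solved by simple bisection. The names `counts`, `s_sum`, `g` and `solve_k` are hypothetical, and the bracket 1.0 to 2.0 is chosen by inspection.

```python
import math

# Observed frequencies n_r for r = 0, 1, ..., 15 (Table Vc as read here; an assumption)
counts = [5, 2, 5, 4, 4, 1, 0, 1, 1, 2, 2, 0, 2, 1, 0, 1]

N = sum(counts)                                      # number of reels
m = sum(r * n for r, n in enumerate(counts)) / N     # sample mean = ML estimate of mu


def s_sum(K):
    """s_K = sum_r n_r * [1/K + 1/(K+1) + ... + 1/(K+r-1)] (digamma-free form)."""
    total = 0.0
    for r, n in enumerate(counts):
        if n:
            total += n * sum(1.0 / (K + j) for j in range(r))
    return total


def g(K):
    """Zero when m + k = k * exp(s_k / N), i.e. when (A.7) is satisfied."""
    return m + K - K * math.exp(s_sum(K) / N)


def solve_k(lo=1.0, hi=2.0, iters=60):
    """Bisection on g between two trial values of k that straddle the root."""
    assert g(lo) * g(hi) < 0, "bracket must straddle the root"
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


k_hat = solve_k()
print(f"N={N}, m={m:.2f}, s_1.0={s_sum(1.0):.2f}, s_1.2={s_sum(1.2):.2f}, k={k_hat:.2f}")
```

Bisection is used here in place of the tabular inspection described in the text; it needs only the two trial values of $k$ that bracket the solution.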


B. Method of solving for $m$ the equation $I_{1-s}(NK, Nm) = \theta$, where $\theta$ is a small fraction

Since

$$I_s(Nm, NK) = 1 - I_{1-s}(NK, Nm) \tag{B.1}$$

putting $\theta = 0.05$ leads at once to the equation (4.6) in the form

$$I_{1-s}(NK, Nm) = 0.05 \tag{B.2}$$

Since $NK$ and $Nm$ are both moderately large, Cochran's formula may be used (Comrie and
Hartley, 1941; Cochran, 1940). This is for $s$ as an unknown, but we can adapt it to solve for $m$
by working with his equations in the following order:

$$h = \frac{4NmK}{m + K} \tag{B.3}$$

$$z = \frac{y}{\sqrt{h - \lambda}} - [\text{illegible}] \tag{B.4}$$

$$m = K\rho e^{-2z} \tag{B.5}$$

where $\rho$ is as before, $y$ = unit normal deviate exceeded by the proportion $\theta$ of the normal
distribution ($y = 1.6449$ at $\theta = 0.05$),

$\lambda = \tfrac{1}{6}(y^{2} + 3) = 0.9509$ for $\theta = 0.05$,

and the constant multiplying the second term of (B.4) is 0.39215 for $\theta = 0.05$.

From (B.5): [illegible] (B.6)

From (B.4) and (B.3): [illegible] (B.7)

So starting with an approximation to $m$, say $m_{(1)}$, it is substituted in (B.3), and $z$ in (B.4) and $m_{(2)}$
(say) in (B.5) are calculated. Then

[illegible] (B.8)

In the cases worked out numerically, the ratio in (B.8) was easily estimated.

When a series of $m$'s are known for a particular $s$, the calculation for a neighbouring value of $s$
may be greatly simplified when $N$ is moderate or large.

For if $m_1$ corresponds to $\rho_1$ and $z_1$, and $m_2$ to $\rho_2$ and $z_2$, from (B.5)

$$\frac{m_2}{m_1} = \frac{\rho_2}{\rho_1} e^{-2(z_2 - z_1)} \tag{B.9}$$

and if $\rho_2 - \rho_1$ is small

$$z_2 - z_1 \approx (m_2 - m_1)\frac{\partial z}{\partial m}$$

which, from (B.7), is very small when $m$ or $N$ are not small, so that:

$$\frac{m_2}{m_1} \approx \frac{\rho_2}{\rho_1} \tag{B.10}$$

This is most accurate, of course, for large $N$, but once $m_1(60)$ and $m_2(60)$ had been obtained,
the following approximation was still better, viz.:

$$\frac{m_2(N_2)}{m_1(N_2)} \approx \frac{m_2(N_1)}{m_1(N_1)} \tag{B.11}$$


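C. An expansion for the Incomplete Beta function in terms of Incomplete Gamma functions

We require both percentage points and percentages when $2K$ is not a whole number. Interpolation
is not easy when $K$ is small, in either Pearson's or Thompson's tables. The expansion
which will now be derived is very convenient in the range we need, and apparently is not in any of
the published work on the function.

We write

$$I_s(K, r) = \frac{\Gamma(K + r)}{\Gamma(K)\Gamma(r)}\int_0^{s} t^{\epsilon}(1 - t)^{r-1}\,dt, \qquad \text{where } \epsilon = K - 1,$$

and put $1 - t = e^{-u}$. Then

$$I_s(K, r) = \frac{\Gamma(K + r)}{\Gamma(K)\Gamma(r)}\int_0^{z} (1 - e^{-u})^{\epsilon} e^{-ru}\,du \tag{C.1}$$

where $z = -\log(1 - s)$.

$r$ is to be large and $\epsilon$ moderate or small. So we write

$$J = \int_0^{z} (2\sinh\tfrac{1}{2}u)^{\epsilon} e^{-(r + \frac{1}{2}\epsilon)u}\,du \tag{C.4}$$

$(2\sinh\tfrac{1}{2}u)^{\epsilon}$ converges rapidly as a power series in $u$. We obtain a Borel expansion by integrating
(C.4) term by term in descending powers of $(r + \tfrac{1}{2}\epsilon)$. The series is

$$(2\sinh\tfrac{1}{2}u)^{\epsilon} = u^{\epsilon}\left\{1 + \frac{\epsilon u^{2}}{24} + \frac{u^{4}}{5760}(5\epsilon^{2} - 2\epsilon) + \frac{u^{6}}{2903040}(35\epsilon^{3} - 42\epsilon^{2} + 16\epsilon) + \dots\right\} \tag{C.5}$$

Since

$$\int_0^{z} u^{a} e^{-Pu}\,du = \frac{\Gamma_{Pz}(a + 1)}{P^{a+1}} \tag{C.6}$$

where $\Gamma_x(n)$ denotes the incomplete $\Gamma$ function $\int_0^{x} t^{n-1}e^{-t}\,dt$, we have

$$J = \frac{\Gamma_{(r+\frac{1}{2}\epsilon)z}(1 + \epsilon)}{(r + \tfrac{1}{2}\epsilon)^{1+\epsilon}} + \frac{\epsilon\,\Gamma_{(r+\frac{1}{2}\epsilon)z}(3 + \epsilon)}{24(r + \tfrac{1}{2}\epsilon)^{3+\epsilon}} + \dots \tag{C.7}$$

We can write this as a series of incomplete $\Gamma$ function ratios. Put $P = r + \tfrac{1}{2}\epsilon$; then

$$I_s(K, r) = \frac{\Gamma(P + 1 + \tfrac{1}{2}\epsilon)}{P^{1+\epsilon}\,\Gamma(P - \tfrac{1}{2}\epsilon)}\left[\frac{\Gamma_{Pz}(1 + \epsilon)}{\Gamma(1 + \epsilon)} + \frac{\epsilon(1 + \epsilon)(2 + \epsilon)}{24P^{2}}\,\frac{\Gamma_{Pz}(3 + \epsilon)}{\Gamma(3 + \epsilon)} + \frac{(5\epsilon^{2} - 2\epsilon)(1 + \epsilon)(2 + \epsilon)(3 + \epsilon)(4 + \epsilon)}{5760P^{4}}\,\frac{\Gamma_{Pz}(5 + \epsilon)}{\Gamma(5 + \epsilon)} + \dots\right] \tag{C.8}$$

This converges rapidly when $\epsilon^{2}$ is not large compared with $P^{2}$ or when $s$ is very small, since the
incomplete $\Gamma$ function ratios then converge rapidly. Pearson's tables (1922) give values of
the ratio for tabulated values of its arguments, or the ratio can be read directly
from Campbell's (1923) chart.

When $\epsilon$ and $P$ are known, the first term by itself gives a good approximation to one of $s$ or $I$
when the other is known. It is easy to find an unknown $P$ also, for we can obtain an even simpler
approximation by expanding the factor outside the brackets, which is almost 1, in (C.8).

Using Stirling's expansion

$$\log\Gamma(x) = \tfrac{1}{2}\log 2\pi + (x - \tfrac{1}{2})\log x - x + \frac{1}{12x} - \dots \tag{C.9}$$

$$\log\Gamma(P + \tfrac{1}{2}\epsilon + 1) - (1 + \epsilon)\log P - \log\Gamma(P - \tfrac{1}{2}\epsilon)$$
$$= P\left\{\log\left(1 + \frac{\epsilon + 2}{2P}\right) - \log\left(1 - \frac{\epsilon}{2P}\right)\right\} + (\tfrac{1}{2}\epsilon + \tfrac{1}{2})\left\{\log\left(1 + \frac{\epsilon + 2}{2P}\right) + \log\left(1 - \frac{\epsilon}{2P}\right)\right\} - (1 + \epsilon) + \frac{1}{12}\left\{\frac{1}{P + \tfrac{1}{2}\epsilon + 1} - \frac{1}{P - \tfrac{1}{2}\epsilon}\right\} + \dots \tag{C.10}$$

The right-hand side of (C.10) simplifies to

$$-\frac{K^{3} - K}{24P^{2}} + \dots \tag{C.11}$$

Substituting in (C.8)

$$I_s(K, r) = \frac{\Gamma_{Pz}(K)}{\Gamma(K)} - \frac{K^{3} - K}{24P^{2}}\left\{\frac{\Gamma_{Pz}(K)}{\Gamma(K)} - \frac{\Gamma_{Pz}(K + 2)}{\Gamma(K + 2)}\right\} + \text{terms* in } P^{-4}, P^{-6}, \dots \tag{C.12}$$

$$(P = r + \tfrac{1}{2}K - \tfrac{1}{2}, \quad z = -\log_e(1 - s))$$

Thus from the first term of (C.12) we easily obtain either $I$, $s$ or $r$ if the other two and $K$ are
known.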
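As a rough numerical check of the approximation, the following sketch compares the first two terms of (C.12) with a direct quadrature of the Incomplete Beta integral. It is illustrative only: (C.12) is taken in the form $I_s(K, r) \approx g_0 - \frac{K^3 - K}{24P^2}(g_0 - g_2)$, where $g_0$ and $g_2$ are the incomplete $\Gamma$ function ratios of index $K$ and $K + 2$ at $Pz$, which is our reading of the printed formula. The function names and the trial values $s = 0.1$, $K = 1.5$, $r = 20$ are hypothetical.

```python
import math


def reg_lower_gamma(a, x, tol=1e-15):
    """Incomplete gamma ratio Gamma_x(a) / Gamma(a), summed as a power series in x."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    total = term
    n = 1
    while term > tol * total:
        term *= x / (a + n)
        total += term
        n += 1
    return total


def beta_via_gamma(s, K, r):
    """First two terms of (C.12) as read here; intended for large r and moderate K."""
    P = r + 0.5 * K - 0.5
    z = -math.log(1.0 - s)
    c = (K**3 - K) / (24.0 * P**2)
    g0 = reg_lower_gamma(K, P * z)
    g2 = reg_lower_gamma(K + 2.0, P * z)
    return g0 - c * (g0 - g2)


def beta_reference(s, K, r, n=4000):
    """Simpson's rule for I_s(K, r) after the substitution w = t**K (check only)."""
    upper = s**K
    h = upper / n

    def f(w):
        return (1.0 - w ** (1.0 / K)) ** (r - 1.0)

    acc = f(0.0) + f(upper)
    for i in range(1, n):
        acc += (4 if i % 2 else 2) * f(i * h)
    integral = acc * h / 3.0 / K
    log_norm = math.lgamma(K + r) - math.lgamma(K) - math.lgamma(r)
    return math.exp(log_norm) * integral


approx = beta_via_gamma(0.1, 1.5, 20.0)
exact = beta_reference(0.1, 1.5, 20.0)
print(approx, exact)
```

For $K = 1$ the correction term vanishes and the first term reduces to $1 - (1 - s)^r$, which is the exact value, so that case gives a convenient spot check.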

The author is indebted to Miss S. F. Davies for many helpful discussions on the practical side 
of the problem. He is also indebted to the Royal Statistical Society and the Scientific Computing 
Service, Ltd., for providing some of the references, and to Dr. J. O. Irwin and Dr. M. S. Bartlett 
for their helpful and encouraging criticisms of the work ; finally to Mr. J. A. M. van Moll and the 
directors of Philips Lamps, Ltd., for permission to publish this paper. 


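* The next term is

$$\frac{K^{3} - K}{5760P^{4}}\left\{(K - 3)(K - 2)(5K + 7)\frac{\Gamma_{Pz}(K)}{\Gamma(K)} - 10(K^{3} - K)\frac{\Gamma_{Pz}(K + 2)}{\Gamma(K + 2)} + (K + 2)(K + 3)(5K - 7)\frac{\Gamma_{Pz}(K + 4)}{\Gamma(K + 4)}\right\}$$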




Summary 

From a batch of reels of enamelled wire, a standard length in a sample of the reels is tested for
pinholes. The rule for accepting or rejecting the batch depended on the maximum in the sample;
by assuming that the distribution of numbers of pinholes is binomial with a negative index that
does not vary much from batch to batch, we deduce a more reliable criterion for accepting the batch
on results from as small a sample as possible. An approximate formula for the quality of the batch
in terms of statistical parameters is also obtained, and the relation of the new to the old test is
discussed.

In the appendix are some results that can be applied generally : 

A. The method, with examples, which was found most convenient for fitting the negative 
binomial to observed distributions by solving the maximum likelihood equations. 

B. An adaptation of Cochran's method for solving the equation in the Incomplete Beta
function:

$I_s(Nm, NK) = \theta$ when $\theta$ is a small fraction and $m$ is the unknown.

C. A good approximation to the Incomplete Beta function in terms of Incomplete Gamma 
functions that may be new. 


References 

Aitken, A. C. (1939). Statistical Mathematics, pp. 16-22.

Barnard, G. A. (1946). "Sequential Tests in Industrial Statistics," Suppt. to Journ. Roy. Statis. Soc.,
VIII, pp. 1-21.

British Association. Mathematical Tables (1931), p. 42.

Campbell, G. A. (1923). Bell System Technical Journal, 2, Part I, pp. 95-113.

Cochran, W. G. (1940). Ann. Math. Statis., 11, 93.

Comrie, L. J., and Hartley, H. O. (1941). Biometrika, 32, Part II, p. 167.

Davis, H. T. (1933). Tables of the Higher Mathematical Functions, 1.

Fisher, R. A. (1941). Annals of Eugenics, 11, pp. 182-187.

Haldane, J. B. S. (1941). Annals of Eugenics, 11, pp. 179-181.

Kendall, M. G. (1943). Advanced Theory of Statistics, 5.13, 5.14, pp. 124-125.

Pairman, Eleanor (1919). Tracts for Computers, No. 1, Tables of the Digamma and Trigamma Functions.

Pearson, Karl (1923). Tables of the Incomplete Gamma-Function.

Pearson, Karl (1934). Tables of the Incomplete Beta-Function.

Thompson, Catherine (1941). Biometrika, 32, Part II, pp. 151-161. Tables of Percentage Points of the
Incomplete Beta-Function.

Wald, A. (1945). "Sequential Tests of Statistical Hypotheses," Ann. Math. Statistics, 16, 117.




A Table of Lagrangian Coefficients for Logarithmic Interpolation of Standard Statistical
Tables to Obtain other Probability Levels

By J. T. Richardson

(Imperial Chemical Industries, Ltd., Billingham, County Durham)

In statistical analysis it is sometimes desirable to obtain the values of statistical functions for
probability levels other than those given in the standard tables. This problem has been discussed by
Simaika (1942), who tested several scales of interpolation. His results show that quadratic
interpolation on the logarithmic scale gave reasonably accurate results, although more accurate
results could usually be obtained by other methods. These other methods, however, are less
direct and usually involve heavier computation than the logarithmic interpolation.

Table of Lagrangian Coefficients for Logarithmic Interpolation

(Column headings are the probability levels of tabular values and row headings are probability levels of the
interpolate. Negative coefficients are shown with a minus sign.)

   x     100     50     25 |    x     250    100     50 |    x     250    100     50 |    x     500    250    100
  35    -125    735    390 |   70     -81    674    407 |  110      51  1,019    -70 |  175    -179    925    254
  36    -124    775    349 |   71     -81    695    386 |  112      62  1,020    -82 |  180    -173    945    228
  37    -123    811    312 |   72     -81    715    366 |  114      73  1,019    -92 |  185    -166    963    203
  38    -119    843    276 |   73     -81    734    347 |  116      85  1,017   -102 |  190    -158    978    180
  39    -115    872    243 |   74     -80    751    329 |  118      96  1,015   -111 |  195    -149    990    159
  40    -109    896    213 |   75     -79    769    310 |  120     108  1,012   -120 |  200    -139  1,000    139
  41    -102    918    184 |   76     -78    785    293 |  122     120  1,009   -129 |  205    -128  1,008    120
  42     -94    937    157 |   77     -77    801    276 |  124     132  1,005   -137 |  210    -116  1,013    103
  43     -85    953    132 |   78     -75    816    259 |  126     145    998   -143 |  215    -103  1,016     87
  44     -75    966    109 |   79     -73    830    243 |  128     157    992   -149 |  220     -90  1,019     71
  45     -64    977     87 |   80     -71    843    228 |  130     170    984   -154 |  225     -76  1,019     57
  46     -53    986     67 |   81     -69    856    213 |  132     183    976   -159 |  230     -62  1,018     44
  47     -41    992     49 |   82     -66    868    198 |  134     196    968   -164 |  235     -47  1,015     32
  48     -28    997     31 |   83     -64    880    184 |  136     209    959   -168 |  240     -32  1,012     20
  49     -14    999     15 |   84     -61    891    170 |  138     222    950   -172 |  245     -16  1,007      9
  50       0  1,000      0 |   85     -58    901    157 |  140     235    940   -175 |  250       0  1,000      0
  51      15    999    -14 |   86     -55    911    144 |  142     248    930   -178 |  255      17    992     -9
  52      30    997    -27 |   87     -52    920    132 |  144     261    920   -181 |  260      33    984    -17
  53      46    993    -39 |   88     -49    929    120 |  146     274    909   -183 |  265      51    974    -25
  54      62    987    -49 |   89     -46    938    108 |  148     288    897   -185 |  270      68    964    -32
  55      78    981    -59 |   90     -42    945     97 |  150     302    884   -186 |  275      87    952    -39
  56      95    973    -68 |   91     -38    953     85 |  152     315    872   -187 |  280     105    940    -45
  57     112    964    -76 |   92     -34    959     75 |  154     329    859   -188 |  285     123    927    -50
  58     130    954    -84 |   93     -30    966     64 |  156     343    845   -188 |  290     142    913    -55
  59     148    943    -91 |   94     -26    972     54 |  158     357    831   -188 |  295     161    898    -59
  60     166    931    -97 |   95     -22    978     44 |  160     371    817   -188 |  300     180    883    -63
  61     184    918   -102 |   96     -18    983     35 |  162     384    803   -187 |  305     199    868    -67
  62     203    904   -107 |   97     -14    988     26 |  164     398    789   -187 |  310     218    852    -70
  63     222    889   -111 |   98      -9    992     17 |  166     412    774   -186 |  315     238    835    -73
  64     241    873   -114 |   99      -5    997      8 |  168     426    759   -185 |  320     257    818    -76
  65     261    857   -118 |  100       0  1,000      0 |  170     440    743   -183 |  325     277    800    -77
  66     280    840   -120 |  101      ..     ..     .. |  172     454    727   -181 |  330     297    781    -78
  67     300    822   -122 |  102      ..     ..    -16 |  174     468    712   -180 |  335     317    762    -79
  68     320    803   -123 |  103      ..     ..    -23 |  176     482    696   -178 |  340     337    743    -80
  69     340    784   -124 |  104      ..     ..    -31 |  178     496    679   -175 |  345     357    723    -80
  70     361    764   -125 |  105      25  1,013    -38 |  180     511    662   -173 |  350     378    703    -81
  71     381    744   -125 |  106      30  1,015    -45 |  182     525    645   -170 |  355     398    683    -81
  72     401    723   -124 |  107      35  1,017    -52 |  184     539    628   -167 |  360     419    662    -81
  73     422    702   -124 |  108      40  1,018    -58 |  186     553    611   -164 |  365     440    641    -81
  74     443    680   -123 |  109      45  1,019    -64 |  188     567    594   -161 |  370     460    620    -80
  75     464    657   -121 |  110      51  1,019    -70 |  190     581    577   -158 |  375     480    599    -79

Note. The decimal point has been omitted from the table for convenience in printing. After the actual
process of multiplication and addition has been carried out, the result is to be divided by 1,000 to give the
interpolate.







If a set of coefficients be prepared for logarithmic interpolation in the range 1 to 10 these same
coefficients can be used for any corresponding range such as 0.1 to 1.0, etc. This makes the
logarithmic method of greater utility than other methods.

The table given constitutes such a set of coefficients for use with three standard levels of
probability. The standard levels chosen are 5, 2.5, 1. These levels were chosen, since they and
submultiples of them are coming into general use either wholly (Thompson, 1941) or in part (Fisher
and Yates, 1938) as standard levels. The decimal point has for convenience been omitted from the
table. This was necessary in the headings, since its presence there would lead to confusion;
for instance, a column headed 0.025 would equally well apply to 0.25 and to 0.0025. The decimal
is to be inserted according to the level required; thus to obtain the coefficients for the case P
(0.02) look up 200 in the table and use tabulated values for P (0.05, 0.025, 0.01), and so on.
The signs of the coefficients are indicated at the head of the column or part column, and
negative coefficients are further distinguished in the table as a reminder to the user. If intermediate
values are required, linear interpolation of this table will be found satisfactory. Since the coefficients
must in each case total unity, it will be seen that omission of decimal points simply means that
results of calculations are to be divided by 1,000 to bring the value obtained for the function
to the correct order. By arranging the three levels used so that the required level is as near as
possible to the centre of the range the error is reduced to a minimum. It will be seen that
for certain values of x coefficients are given for more than one set of standard levels. These
alternatives occur at the end of chosen ranges, and either may be used to suit the user's convenience.
Since the aim has been to retain simplicity and speed with reasonable accuracy, rather than to obtain
high accuracy at any cost, the coefficients have been given correct to three figures. That
three-figure accuracy is satisfactory will be seen from the following table, where errors arising when
interpolating in the scale of the normal deviate using the tabulated coefficients are compared with the
corresponding errors when coefficients correct to five figures are used.


Probability level                    0.015       0.020       0.035       0.15
Tabled value of normal deviate       2.17009     2.05375     1.81191     1.03643
Error of interpolate (a)            -0.00073    +0.00070    -0.00083    -0.00570
Error of interpolate (b)            -0.00079    +0.00094    -0.00074    -0.00579

(a) Interpolation with coefficients correct to five figures.
(b) Interpolation with coefficients correct to three figures.


Two examples of their use are given, the examples being taken so that the results can be
compared with values given in existing tables.

Example 1. Find the value of χ² for 11 degrees of freedom with P = 0.02.

Referring to the table, we find the coefficients for 2 given under the row heading 200, and column
headings 500, 250, 100. The levels required in the standard tables are therefore 0.05, 0.025 and
0.01. The values of χ² for these levels and 11 degrees of freedom are 19.675, 21.920, 24.725
respectively (Thompson, 1941).

The coefficients are -139, +1,000, +139.

The calculation is therefore -139 × 19.675 + 1,000 × 21.920 + 139 × 24.725 = 22,622,

i.e., χ² = 22.622 for P = 0.02.
True value = 22.618 (Fisher and Yates, 1938).

The value obtained by ordinary linear interpolation is 22.855.
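The arithmetic of Example 1 can be reproduced by computing the three-point Lagrangian coefficients directly on the logarithmic probability scale. This is a minimal sketch: the function name `lagrange_log_coefficients` is hypothetical, and the coefficients are computed to full precision rather than taken from the three-figure table, so the result differs slightly from 22.622 in the third decimal place.

```python
import math


def lagrange_log_coefficients(p, levels):
    """Three-point Lagrangian coefficients for level p, on the log10(P) scale."""
    x = math.log10(p)
    nodes = [math.log10(q) for q in levels]
    coeffs = []
    for i, xi in enumerate(nodes):
        c = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                c *= (x - xj) / (xi - xj)
        coeffs.append(c)
    return coeffs


levels = (0.05, 0.025, 0.01)            # standard levels (columns 500, 250, 100)
chi2_11df = (19.675, 21.920, 24.725)    # tabled chi-square values, 11 degrees of freedom

coeffs = lagrange_log_coefficients(0.02, levels)
estimate = sum(c * v for c, v in zip(coeffs, chi2_11df))
print([round(1000 * c) for c in coeffs], round(estimate, 3))
```

Multiplying the computed coefficients by 1,000 and rounding gives the entries of the row headed 200 in the table.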


Example 2. Find the value of t for 15 degrees of freedom with P = 0.3.

The required coefficients are given under the row heading 300 and column headings 500, 250,
100. The levels required in the standard tables are therefore 0.50, 0.25, 0.10. The values of t
for these levels and 15 degrees of freedom are 0.691, 1.197, 1.753 respectively (Merrington, 1942).

The coefficients are +180, +883, -63.

The calculation is 180 × 0.691 + 883 × 1.197 - 63 × 1.753 = 1,071,

i.e., t = 1.071 when P = 0.3.

True value = 1.074 (Fisher and Yates, 1938).

The value obtained by ordinary linear interpolation is 1.096.

Inverse Interpolation

Although the table was designed for use in direct interpolation, a method has been devised for
its use in inverse interpolation. The method consists of obtaining an approximate result, say by
inspection, followed by successive approximations. Usually two approximations will suffice.

Derivation.

Suppose a function is known to have a value "h", and the probability of this value is required.
Denote by a, b and c the values given by the standard tables. These values will be those nearest
to the above value "h". Since, however, P is not known, the coefficients cannot be obtained from
the table. Denote the required coefficients by l, m, n.

Then  la + mb + nc = h

but   l + m + n = 1

(In these equations it is assumed that the decimal point has been inserted.)

so that  lb + mb + nb = b

Eliminating the second term gives

l(b - a) - n(c - b) = b - h

i.e., Al - Cn = H, where A, C, H are the differences from the middle standard value.

This equation can then be used to obtain the required probability by successive approximations.
A rough value can be obtained by inspection; the coefficients for this value being obtained from the
table, a new value of H is calculated, say H_1. A first approximation to P is then estimated, using
linear interpolation with the known values H, H_1 and H_2, where H_2 is the value of H at one of
the standard levels. Using this value of P to obtain coefficients from the table, the process is
repeated. It will usually be found unnecessary to proceed to a third approximation.

In practice it will be found that the procedure can be carried out very rapidly.

Example. Find the significance of χ² = 33.2 for 20 degrees of freedom.

From the standard tables the following values are obtained:

P       0.05       0.025      0.01
χ²      31.4104    34.1696    37.5662

Taking differences from 34.1696 gives

A = 2.7592    C = 3.3966    H = 0.9696

From inspection the expected value is approximately 0.033.

From the table for P = 0.033 and inserting decimal point

l = 0.297    n = -0.078    so that H_1 = 1.0844.

Also when P = 0.05, l = 1.0, n = 0.0 and H = 2.7592.

For first approximation

P = 0.033 + (0.9696 - 1.0844)/(2.7592 - 1.0844) × (0.05 - 0.033) using linear interpolation

  = 0.033 - (0.1148/1.6748) × (0.017)

  = 0.033 - 0.0012 = 0.0318

When P = 0.0318, l = 0.249, n = -0.074, H = 0.9384.

For second approximation

P = 0.0318 - (0.9696 - 0.9384)/(1.0844 - 0.9384) × (0.0318 - 0.033) using linear interpolation

  = 0.0318 - (0.0312/0.1460) × (-0.0012)

  = 0.0318 + 0.00026 = 0.0321 rounding off to three figures.

A third approximation should be unnecessary but is included for demonstration.

When P = 0.0321, l = 0.261, n = -0.075, H = 0.9749

and for third approximation P = 0.0321 + (0.9696 - 0.9749)/(0.9749 - 0.9384) × 0.0003

  = 0.0321 - 0.00004 = 0.0321 rounding off to three figures.
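The inverse problem can also be solved mechanically. The sketch below is an illustration rather than the tabular procedure described above: it computes the Lagrangian coefficients directly instead of reading them from the table, and it replaces the successive linear approximations by bisection on the logarithmic scale. The function names are hypothetical.

```python
import math


def interpolate(p, levels, values):
    """Quadratic Lagrangian interpolate at level p on the log10(P) scale."""
    x = math.log10(p)
    nodes = [math.log10(q) for q in levels]
    total = 0.0
    for i, (xi, v) in enumerate(zip(nodes, values)):
        c = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                c *= (x - xj) / (xi - xj)
        total += c * v
    return total


def inverse_interpolate(h, levels, values, lo, hi, iters=60):
    """Find P between lo and hi whose interpolate equals h (bisection in log P)."""
    f_lo = interpolate(lo, levels, values) - h
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # midpoint on the logarithmic scale
        f_mid = interpolate(mid, levels, values) - h
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return math.sqrt(lo * hi)


levels = (0.05, 0.025, 0.01)
chi2_20df = (31.4104, 34.1696, 37.5662)   # tabled values for 20 degrees of freedom

p = inverse_interpolate(33.2, levels, chi2_20df, lo=0.025, hi=0.05)
print(round(p, 4))
```

The bracket 0.025 to 0.05 is chosen because the given value 33.2 lies between the tabled values at those two levels.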


References 

Fisher, R. A., and Yates, F. (1938). Statistical tables for biological, agricultural and medical research.

Merrington, M. (1942), "Table of percentage points of the t-distribution," Biometrika, 32, 300.

Simaika, J. B. (1942), "Interpolation for fresh probability levels between the standard table levels of a
function," Biometrika, 32, 263-76.

Thompson, C. M. (1941), "Table of percentage points of the χ² distribution," Biometrika, 32, 187.





Linear Sequential Rectifying Inspection for Controlling Fraction Defective 

By F. J. Anscombe 
Rothamsted Experimental Station 

Introduction 

Wald and Barnard (Refs. 5 and 1) have considered the sequential inspection of a batch of items to 
test whether the proportion of defectives is allowable. The batch is assumed to be large, and a 
relatively small number of items are inspected. After inspection the batch is either accepted or 
rejected. Any batch that is accepted is of the same quality after inspection as before — the effect 
of the inspection is simply to weed out batches of bad quality. This non-rectifying kind of in- 
spection could appropriately be applied, for example, by a producer when purchasing batches of 
components from a subcontractor. 

But the inspection of a batch of items may serve a quite different purpose, provided that the 
test is not destructive. The maker of the goods may wish to assure himself, by a final inspection, 
that the fraction defective in the batch when it leaves him will be low. If when he starts the in- 
spection he finds that the fraction defective is too high he will continue the inspection until a 
considerable part or all of the batch has been examined, the defective items found being removed, 
rectified, or replaced by good ones. The quality of the batch may thus be considerably improved 
by the inspection. The batch is never rejected. 

Such inspection may be termed rectifying inspection. Pioneer work has been done in this field 
by Dodge and Romig (Ref. 3). The present paper (which results from the author’s wartime 
interest in inspection problems while a member of the Ministry of Supply Advisory Service on 
Statistical Method, S.R. 17) gives an adaptation of Wald’s sequential methods to rectifying in- 
spection, and formulae from which a set of tables could be computed. 

Dodge-Romig Inspection 

The inspection procedure of Dodge and Romig is as follows. An initial sampling inspection
is performed, which may be by either single or double sampling. (In single sampling a sample of $n$
items is examined, and the quality is regarded as satisfactory if the number of defectives does not
exceed $c$, $n$ and $c$ being given integers. In double sampling, a first sample of size $n_1$ is taken, and
the quality is regarded as satisfactory if the number of defectives does not exceed $c_1$ and as
unsatisfactory if it exceeds $c_2$ ($> c_1$). If the number of defectives found lies between $c_1 + 1$ and $c_2$
inclusive, a second sample of size $n_2$ is examined and the quality is regarded as satisfactory if the
total number of defectives found does not exceed $c_2$.) If this sampling indicates that the quality of
the batch is satisfactory, the batch is passed as correct. If not, the batch is inspected 100 per cent.,
and all defective items found are rectified or replaced.
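The chance that the initial sampling passes a batch can be sketched as follows. This is an illustration under an assumption not made explicit above: the batch is taken to be large compared with the samples, so that the number of defectives in a sample is treated as binomial. The function names and the plan constants used in the usage lines are hypothetical.

```python
from math import comb


def binom_pmf(d, n, p):
    """Probability of exactly d defectives in a sample of n (binomial model)."""
    return comb(n, d) * p**d * (1.0 - p) ** (n - d)


def accept_single(n, c, p):
    """Single sampling: satisfactory if at most c defectives are found in n items."""
    return sum(binom_pmf(d, n, p) for d in range(c + 1))


def accept_double(n1, c1, n2, c2, p):
    """Double sampling: pass on the first sample, or on the two samples combined."""
    prob = sum(binom_pmf(d, n1, p) for d in range(c1 + 1))
    for d1 in range(c1 + 1, c2 + 1):
        # a second sample is taken; the combined total must not exceed c2
        second_ok = sum(binom_pmf(d2, n2, p) for d2 in range(c2 - d1 + 1))
        prob += binom_pmf(d1, n1, p) * second_ok
    return prob


p_single = accept_single(50, 1, 0.02)
p_double = accept_double(50, 1, 100, 3, 0.02)
print(round(p_single, 4), round(p_double, 4))
```

One minus either probability is the chance that a batch of the stated quality goes on to 100 per cent. inspection.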

In deciding on the constants of the scheme two quantities are taken account of, the first the
process average fraction defective and the second either the "average outgoing quality limit
(AOQL)" or the "lot tolerance". The process average fraction defective is the average quality
of a long run of production before it is inspected. The AOQL is the average quality of batches
after inspection for the most unfavourable incoming quality, i.e., the maximum average fraction
defective of outgoing material, whatever the incoming quality. The lot tolerance is that outgoing
quality of a batch which, whatever the incoming quality, there is at most only a small (stated)
chance of permitting. The Dodge-Romig sampling inspection tables are so constructed that batches
whose incoming quality is at the process average will be inspected with minimum average amount of
inspection (taking into account the possibility of occasional 100 per cent. inspection) consistent
with the guarantee of a given AOQL or lot tolerance.*
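The definition of the AOQL can be illustrated for a single sampling scheme. The sketch below rests on assumptions that are ours, not the paper's: a binomial model for the sample, defectives found in the sample being replaced by good items, and a batch that fails the sampling being fully rectified, so that the average outgoing quality for incoming fraction defective $p$ is $p \cdot P_a(p) \cdot (N - n)/N$, where $P_a$ is the chance of passing on the sample. The lot size and plan constants are hypothetical, and the maximum is found by a crude grid search.

```python
from math import comb


def accept_single(n, c, p):
    """Chance that a sample of n contains at most c defectives (binomial model)."""
    return sum(comb(n, d) * p**d * (1.0 - p) ** (n - d) for d in range(c + 1))


def aoq(p, lot_size, n, c):
    """Average outgoing quality when the incoming fraction defective is p."""
    return p * accept_single(n, c, p) * (lot_size - n) / lot_size


def aoql(lot_size, n, c, steps=2000):
    """Approximate AOQL: the largest AOQ over a grid of incoming qualities."""
    grid = [i / steps for i in range(steps + 1)]
    return max(aoq(p, lot_size, n, c) for p in grid)


limit = aoql(lot_size=1000, n=50, c=1)
print(round(limit, 4))
```

The outgoing quality is low both for very good incoming material and for very bad material (which is nearly always inspected 100 per cent.), so the maximum falls at an intermediate incoming quality.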

The lot tolerance is similar to what Barnard has called the "risk point" of a non-rectifying
scheme. In practice it is necessary, or at least convenient, to express the guarantee to be given by
an inspection scheme as a single simple condition, and for a rectifying scheme when we are
concentrating attention on the batches individually the lot tolerance is the obvious condition to choose.
Dodge and Romig state, in fact, that their tables based on the lot tolerance "have been found
particularly useful in inspections made by the ultimate consumer or his purchasing agent for lots
or shipments purchased more or less intermittently" (Ref. 3, p. 38).

* This at least was the intention, but it may not have been exactly fulfilled, since approximations were
made in the solution.

The concept of AOQL was introduced to cater for the inspection of a long series of batches 
when it is important to control the average quality of output and not the quality of individual 
batches. It “ has been found particularly helpful, for example, in consumer inspections of con- 
tinuing purchases of large quantities of a product, and in manufacturing process inspections of 
parts where the inspection lots tend to lose their identity by merger in a common storeroom from 
which quantities are withdrawn on order as needed”. However, since Dodge and Romig con- 
structed their tables other schemes of rectifying inspection have been suggested (Refs. 2 and 6), 
in which the output is inspected continuously and not in batches of definite size. Such schemes 
would normally be more efficient, and we may surmise that they will largely replace batch-by- 
batch inspection as provided for in the Dodge-Romig AOQL tables. In this paper we shall be 
concerned with inspection of batches, and so the lot tolerance is the form of quality guarantee 
that will be considered, rather than the AOQL. 

In practice the initial sampling inspection in a Dodge-Romig scheme is sometimes carried 
out by a different staff from the 100 per cent. inspection, and it may be more important to minimize
the amount of sampling inspection than the total amount of inspection. In such cases the sampling 
inspection is not primarily rectifying, and should be considered by itself as non-rectifying. But 
if all the inspection is carried out by the same staff, then the principle of minimizing the average 
total amount of inspection may well be a sound one to adopt, and as an approximation we can if 
we like choose a scheme which minimizes the average amount of inspection when the incoming 
quality is at the process average. 

Now that sequential methods have been developed, it is clear that the Dodge-Romig procedure 
can be made more efficient, as to the number of items inspected, by the use of a sequential scheme 
for the initial sampling inspection instead of single or double sampling, unless sequential sampling 
is impracticable.* But if all the inspection is done by the same staff there is no need for the scheme
to consist of a sequential inspection followed by a possible distinct 100 per cent. inspection. Both
parts can be combined into a sampling scheme that may include a large part or the whole of the
batch. We have this state of affairs, in fact, if we use a Wald sequential scheme in which (in
Barnard’s notation) the acceptance-handicap is small but the rejection-handicap is so large
that the decision to inspect 100 per cent. is never reached, the whole batch having been inspected
first anyway during the sequential sampling; in other words, a scheme with a linear acceptance
boundary, as in a Wald scheme, but no rejection boundary.

We are thus led to consider the inspection procedure defined below. The mathematical treat- 
ment will concentrate on the limiting expressions when the fractions defective are very small, and 
will lead to the formulae needed for computing a set of tables similar to the Dodge-Romig tables. 
If quality is expressed in terms of the actual number of defectives in the batch instead of the fraction 
defective, schemes can be set out in double-entry tables of adequate accuracy for fairly low fractions 
defective (such as are normally acceptable), and the space saved in this way, as compared with the 
triple-entry Dodge-Romig tables, can be devoted to information on average sample size. It is 
hoped that it will be possible for such tables to be published shortly. 

Dodge and Romig state that they have investigated multiple sampling methods, and consider 
that anything more elaborate than double sampling does not justify its complexity. In defence of 
the procedure treated here it may be remarked that 

(1) Dodge and Romig were presumably unaware of it, 

(2) it is specified by two constants only, $H$ and $b$ (or $\alpha$ and $\beta$), while for example
Dodge-Romig double sampling needs four constants, two sample sizes, $n_1$ and $n_2$, and two
defect-numbers, $c_1$ and $c_2$,

(3) whether the occasions when multiple sampling is physically convenient are frequent or 
rare, it is desirable that multiple sampling methods should be explored and made available. 

* It is supposed in sequential sampling that the items are inspected one by one. With certain particular
schemes there is only a slight loss in efficiency if the items are tested in sets of size $(b + 1)$ together, where
$b$ is the “penalty”, in Barnard’s notation. If this is too small a number to be tested conveniently together,
then a cruder method such as double sampling must be used.





A Rectifying Sequential Inspection Scheme

We suppose that we are given a batch of $N$ items, of which an unknown number $Y$ are defective,
and the batch is sampled at random item by item. It is convenient to represent the course of the
sampling by a path on an “inspection diagram” (Ref. 1), the point $(x, y)$ being reached when
$x + y$ items have been sampled, $x$ of them found to be good, and $y$ defective. If the whole batch
is sampled, the path must lead to the point $(N - Y, Y)$. We consider the following

Sampling rule. Sampling will proceed, and defective items found will be replaced by good
ones, until either the whole batch has been tested or the path reaches one of the boundary
points:

$$(H, 0),\ (H + b, 1),\ (H + 2b, 2),\ \dots,\ (H + \mu b, \mu),$$

where $H$ and $b$ are given integers, and $\mu$ is the greatest integer less than $\dfrac{N - H}{b + 1}$. (See Fig. 1.)

We can put this alternatively by saying that an initial sample of size $H$ is taken, and thereafter
further samples of size $b + 1$, and sampling is discontinued if there are no defectives in the first
sample, or one only in the first two samples, or two only in the first three samples, etc. If a score
is kept by adding 1 for a good item and subtracting $b$ for a defective, the score being initially zero,
the inspection ceases if the score reaches $H$.

Fig. 1.
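The score form of the sampling rule lends itself to a short simulation. The following is only an illustrative sketch in Python: the function name `inspect_batch` and the numerical values in the example call are assumptions chosen for illustration, not part of the paper.

```python
import random

def inspect_batch(N, Y, H, b, rng):
    """Inspect one batch of N items containing Y defectives, item by item.

    Returns (items_tested, defectives_found). Defectives found are taken
    to be replaced by good items, so Y - defectives_found remain afterwards.
    """
    items = [1] * Y + [0] * (N - Y)   # 1 marks a defective item
    rng.shuffle(items)                # random order of inspection
    score = 0
    found = 0
    for tested, item in enumerate(items, start=1):
        if item:
            found += 1
            score -= b                # subtract b for a defective
        else:
            score += 1                # add 1 for a good item
        if score == H:                # boundary point (H + found*b, found) reached
            return tested, found
    return N, found                   # whole batch has been tested

# Hypothetical example values.
rng = random.Random(0)
tested, found = inspect_batch(N=1000, Y=5, H=300, b=100, rng=rng)
```

Whenever sampling stops short of the whole batch, the number tested equals $H + y(b + 1)$, where $y$ is the number of defectives found, in agreement with the boundary points listed in the rule.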

Given $N$, the problem is to decide on $H$ and $b$ so that the following conditions are satisfied:

(1) Whatever value $Y$ may have, there is at most a risk $\epsilon$ that after inspection the batch will
contain $Y'$ or more defectives ($Y'$ given).

(2) For some range of values of $Y$, perhaps simply when $Y$ has a given value $Y_0$, the
average number of items tested is as small as possible.

Principal Formulae

Theorem 1. The chance of reaching the point $(x, y)$ on the inspection diagram is equal to
the number of possible paths from $(0, 0)$ to $(x, y)$ that do not include a boundary point (other
than $(x, y)$ itself if that should be a boundary point) multiplied by

$$\frac{Y!\,(N - x - y)!\,(N - Y)!}{(Y - y)!\,N!\,(N - Y - x)!}.$$

Here we must have $y \le Y$, $x \le N - Y$, since otherwise $(x, y)$ is inaccessible.

Proof. The chance of reaching $(x, y)$ by any one path is

$$\frac{Y(Y - 1)(Y - 2)\dots(Y - y + 1)\,(N - Y)(N - Y - 1)\dots(N - Y - x + 1)}{N(N - 1)(N - 2)\dots(N - y - x + 1)}.$$

Altering the order of the defectives (i.e., the order in which they occur in the series of $x + y$ observations)
merely alters the order of the factors in the numerator.


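Theorem 2. The numbers of paths to the boundary points at which $y = 0, 1, 2, \dots, \mu$ are
respectively

$$1,\quad H,\quad \tfrac{1}{2!}H(H + 2\,\overline{b+1} - 1),\quad \tfrac{1}{3!}H(H + 3\,\overline{b+1} - 1)(H + 3\,\overline{b+1} - 2),\ \dots,$$

the general term for the boundary point $(H + yb, y)$ being

$$\frac{H}{H + y(b + 1)}\ {}^{H + y(b+1)}C_{y}.$$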
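The general term of Theorem 2 can be checked for small cases by counting paths directly on the inspection diagram. The sketch below is offered only as an illustration; the function names and the values $H = 4$, $b = 2$ are assumptions, not taken from the paper.

```python
from math import comb

def boundary_path_counts(H, b, y_max):
    """Count, by direct enumeration, the paths from (0, 0) to each boundary
    point (H + y*b, y), y = 0..y_max, that touch no earlier boundary point."""
    x_max = H + y_max * b
    boundary = {(H + y * b, y) for y in range(y_max + 1)}
    ways = {(0, 0): 1}                # ways[(x, y)] = admissible paths to (x, y)
    for total in range(1, x_max + y_max + 1):
        for y in range(0, min(total, y_max) + 1):
            x = total - y
            if x > x_max:
                continue
            count = 0
            # A path arrives from the left (good item) or from below (defective);
            # it may not continue through a boundary point.
            for prev in ((x - 1, y), (x, y - 1)):
                if prev in ways and prev not in boundary:
                    count += ways[prev]
            ways[(x, y)] = count
    return [ways[(H + y * b, y)] for y in range(y_max + 1)]

def theorem2_count(H, b, y):
    """General term of Theorem 2: H / (H + y(b+1)) times C(H + y(b+1), y)."""
    n = H + y * (b + 1)
    return H * comb(n, y) // n

counts = boundary_path_counts(4, 2, 3)
formula = [theorem2_count(4, 2, y) for y in range(4)]
```

For these illustrative values both lists come to 1, 4, 18, 88.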

Proof. The number of paths is easily verified for $y < 3$. We consider now $y \ge 1$. For a given
value of $y$, say $y_0$, we consider points on the inspection diagram for which $x + y = H$ + a multiple
of $b + 1$ and $0 < y \le y_0$. In Fig. 2 $y_0 = 6$. The points on each constant-sample-size diagonal are
numbered from the top, those on the first diagonal (through $(H, 0)$) being numbered $1, 2, \dots, y_0$,
those on the second $1, 2, \dots, y_0 - 1$, etc., stopping short of the boundary point each time.

Following the method of Stockman and Armitage (Ref. 4), we write down the number of paths
from $(0, 0)$ to the points on the first diagonal, and the matrices of numbers of paths from the $i$th
point on one diagonal to the $j$th point on the next, and multiply all together to obtain the number
of paths from $(0, 0)$ to $(H + y_0 b, y_0)$. We get, writing now $y$ instead of $y_0$,

$$\bigl({}^{H}C_{y},\ {}^{H}C_{y-1},\ \dots,\ {}^{H}C_{1}\bigr)
\begin{pmatrix} 1 & 0 & \cdots \\ b+1 & 1 & \cdots \\ \vdots & \vdots & \ddots \end{pmatrix}
\cdots
\begin{pmatrix} 1 \\ b+1 \end{pmatrix}$$

Sizes: $1 \times y$, $y \times (y - 1)$, $\dots$, $2 \times 1$.

The product of all the matrices except the first is the following column:

$$\begin{pmatrix} 1 \\ {}^{y(b+1)}C_{1} - (b+1)\cdot{}^{y(b+1)-1}C_{0} \\ \vdots \\ {}^{y(b+1)}C_{y-1} - (b+1)\cdot{}^{y(b+1)-1}C_{y-2} \end{pmatrix}$$

This is easily proved by induction. We have to shew that (writing $s$ for $b + 1$)

$${}^{s}C_{r}\cdot{}^{ys}C_{0} + {}^{s}C_{r-1}\{{}^{ys}C_{1} - s\cdot{}^{ys-1}C_{0}\} + \dots = {}^{(y+1)s}C_{r} - s\cdot{}^{(y+1)s-1}C_{r-1}.$$

Now LHS = coefficient of $t^{r}$ in $(1 + t)^{s}\{(1 + t)^{ys} - st(1 + t)^{ys-1}\}$ = RHS.

The total number of paths is therefore

$${}^{H}C_{y}\cdot 1 + {}^{H}C_{y-1}\{{}^{y(b+1)}C_{1} - (b+1)\cdot{}^{y(b+1)-1}C_{0}\} + \dots + {}^{H}C_{0}\{{}^{y(b+1)}C_{y} - (b+1)\cdot{}^{y(b+1)-1}C_{y-1}\},$$

where we have added an extra zero term at the end. This expression is the coefficient of $t^{y}$ in

$$(1 + t)^{H}(1 + t)^{y(b+1)} - (b + 1)t\,(1 + t)^{H}(1 + t)^{y(b+1)-1},$$

i.e.,

$${}^{H + y(b+1)}C_{y} - (b + 1)\cdot{}^{H + y(b+1) - 1}C_{y-1} = \frac{H}{H + y(b + 1)}\ {}^{H + y(b+1)}C_{y}. \qquad \text{Q.E.D.}$$

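Combining theorems 1 and 2, we have the chance of reaching each of the boundary points.
The most convenient form of the chance $P_y$ of reaching the boundary point $(H + yb, y)$, for
computing, is probably

$$P_y = \frac{H}{H + y(b + 1)}\ {}^{H + y(b+1)}C_{y}\cdot\frac{Y!\,(N - H - y\,\overline{b+1})!\,(N - Y)!}{(Y - y)!\,N!\,(N - Y - H - yb)!} \quad\text{if } y \le Y \text{ and } H + yb \le N - Y,$$

$$P_y = 0 \quad\text{otherwise.}$$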



Limit as $N \to \infty$. From now on we consider the limiting case $N \to \infty$, $Y$ finite. $H$ and $b$
will be infinite also, but

$$\alpha = \frac{H}{N}, \qquad \beta = \frac{b}{N}$$

will remain finite. $\mu$ is now the greatest integer less than $\dfrac{1 - \alpha}{\beta}$. It has been found that with
non-rectifying sequential schemes formulae derived for the limiting case fraction defective $\to 0$ are
sufficiently accurate for practical purposes, provided that $b$ is not less than 10 or so, and we shall
expect that the analogous limiting case here considered will give formulae for $\alpha$ and $\beta$ equally useful.

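Applying Stirling's formula to the expression in Theorem 1, and combining with Theorem 2,
we get

Theorem 3. In the limit as $N \to \infty$, the chances of reaching the boundary points
$(\alpha N, 0)$, $(\overline{\alpha + \beta}\cdot N, 1)$, $(\overline{\alpha + 2\beta}\cdot N, 2)$, $\dots$, $(\overline{\alpha + \mu\beta}\cdot N, \mu)$ become $P_0, P_1, P_2, \dots, P_\mu$,
where

$$P_0 = (1 - \alpha)^{Y},$$

and for $y \ge 1$

$$P_y = {}^{Y}C_{y}\cdot\alpha(\alpha + y\beta)^{y-1}(1 - \alpha - y\beta)^{Y-y} \quad\text{if } y \le Y,$$

$$P_y = 0 \quad\text{if } y > Y.$$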
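As a numerical illustration of the passage to the limit, the exact chance for a finite batch (the path count of Theorem 2 multiplied by the chance of Theorem 1) may be compared with the limiting chance of Theorem 3. This is a sketch only: the function names and the batch values are assumptions, and the exact chance is written in an algebraically equivalent binomial form chosen here for convenience.

```python
from fractions import Fraction
from math import comb

def boundary_chance_exact(N, Y, H, b, y):
    """Exact chance of stopping at boundary point (H + y*b, y) for a batch
    of N items containing Y defectives."""
    n = H + y * (b + 1)                 # items sampled at this boundary point
    if y > Y or n >= N or H + y * b > N - Y:
        return Fraction(0)
    # comb(n, y) * comb(N - n, Y - y) / comb(N, Y) equals the number of
    # unrestricted paths to the point times the chance of any one path.
    return Fraction(H, n) * Fraction(comb(n, y) * comb(N - n, Y - y), comb(N, Y))

def boundary_chance_limit(alpha, beta, Y, y):
    """Limiting chance P_y of Theorem 3."""
    if y > Y or alpha + y * beta >= 1:
        return 0.0
    if y == 0:
        return (1 - alpha) ** Y
    return (comb(Y, y) * alpha * (alpha + y * beta) ** (y - 1)
            * (1 - alpha - y * beta) ** (Y - y))

# Hypothetical batch: N = 10000, H = 3690, b = 1345, Y = 3 defectives.
N, H, b, Y = 10000, 3690, 1345, 3
exact = [boundary_chance_exact(N, Y, H, b, y) for y in range(Y + 1)]
limit = [boundary_chance_limit(H / N, b / N, Y, y) for y in range(Y + 1)]
```

With $Y$ not exceeding $\mu$ every path must end on a boundary point, so the exact chances sum to unity; the limiting values differ from them only in the fourth decimal place for a batch of this size.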


Note that the condition $x \le N - Y$ becomes the trivial one $x/N \le 1$ as $N \to \infty$ and $Y$ remains
finite. Thus if $Y \ge \mu$ all the boundary points are accessible, however large $Y$ is. This is
not true in general when $N$ is finite.

The average sample size $A$, i.e., the average number of items inspected, is given by the formula

$$\frac{A}{N} = \alpha + (P_1 + 2P_2 + \dots + \mu P_\mu)\beta + \Bigl(1 - \sum_{y=0}^{\mu} P_y\Bigr)(1 - \alpha), \qquad (1)$$

where, for a given scheme, $\alpha$ and $\beta$ are constants and the $P_y$ are functions of $Y$, the total number of
defectives. For values of $Y$ not exceeding $\mu$ this expression can be simplified, for we have then

$$\frac{A}{N} = \alpha\{1 + Y\beta(1 - \alpha - \beta)^{Y-1} + \dots + y\cdot{}^{Y}C_{y}\cdot\beta(\alpha + y\beta)^{y-1}(1 - \alpha - y\beta)^{Y-y} + \dots + Y\beta(\alpha + Y\beta)^{Y-1}\}.$$

If the expression inside the curly brackets is expanded by the multinomial theorem and the terms
collected, it will be found that all terms containing a positive power of $\alpha$ vanish, and the coefficient
of $\beta^{n}$ is

$${}^{Y}C_{n}\{n^{n} - {}^{n}C_{1}(n - 1)^{n} + {}^{n}C_{2}(n - 2)^{n} - \dots + (-1)^{n-1}\,{}^{n}C_{n-1}\} = {}^{Y}C_{n}\cdot n!.$$

Thus for $Y \le \mu$,

$$\frac{A}{N} = \alpha\{1 + Y\beta + Y(Y - 1)\beta^{2} + Y(Y - 1)(Y - 2)\beta^{3} + \dots\}. \qquad (1')$$

If now we know the “process curve” of the incoming batches, i.e., the relative frequencies
with which different values of Y will be presented, we can obtain the actual average sample size 
that will be experienced, on suitable summation of the values of A calculated for individual values of 
Y. 

The average outgoing quality (AOQ), i.e., the average number of defectives in the batch after
inspection, for any initial number of defectives $Y$, is

$$\Bigl(\sum_{y=0}^{\mu} P_y\Bigr)Y - (P_1 + 2P_2 + \dots + \mu P_\mu),$$

which for $Y \le \mu$ reduces to

$$Y\{1 - \text{the value of } A/N \text{ for } (Y - 1)\}.$$

The AOQL is the maximum value of this as $Y$ varies.
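The formulae for the average sample size and the average outgoing quality can be evaluated directly from the limiting chances of Theorem 3. The sketch below is illustrative only; the function names are assumptions, and the values $\alpha = 0.3690$, $\beta = 0.1345$ are those of the first scheme in the specimen table given later in the paper.

```python
from math import ceil, comb

def boundary_probs(alpha, beta, Y):
    """Limiting chances P_0..P_mu of Theorem 3 for Y defectives in the batch."""
    mu = ceil((1 - alpha) / beta) - 1     # greatest integer less than (1 - alpha)/beta
    probs = []
    for y in range(mu + 1):
        if y > Y:
            probs.append(0.0)
        elif y == 0:
            probs.append((1 - alpha) ** Y)
        else:
            probs.append(comb(Y, y) * alpha * (alpha + y * beta) ** (y - 1)
                         * (1 - alpha - y * beta) ** (Y - y))
    return probs

def average_sample_fraction(alpha, beta, Y):
    """A/N from formula (1)."""
    p = boundary_probs(alpha, beta, Y)
    weighted = sum(y * py for y, py in enumerate(p))
    return alpha + beta * weighted + (1 - sum(p)) * (1 - alpha)

def average_sample_fraction_series(alpha, beta, Y):
    """A/N from formula (1'), intended for Y not exceeding mu."""
    total, term = 0.0, 1.0
    for k in range(Y + 1):
        total += term                     # term = Y(Y-1)...(Y-k+1) * beta**k
        term *= (Y - k) * beta
    return alpha * total

def average_outgoing_defectives(alpha, beta, Y):
    """AOQ: average number of defectives left in the batch after inspection."""
    p = boundary_probs(alpha, beta, Y)
    return Y * sum(p) - sum(y * py for y, py in enumerate(p))

alpha, beta = 0.3690, 0.1345
aoq = {Y: average_outgoing_defectives(alpha, beta, Y) for Y in range(31)}
worst_Y = max(aoq, key=aoq.get)           # value of Y at which the AOQL occurs
```

For this scheme the search over $Y$ is limited, as an assumption, to $Y \le 30$; beyond that the chance of stopping short of the whole batch is negligible.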



The Lot Tolerance Condition

Given the “lot tolerance” $Y'$ and the corresponding risk $\epsilon$, we have to find schemes (i.e.,
values of $\alpha$ and $\beta$) with this lot tolerance and with minimum average sample size for some specified
value or values of $Y$. The following results are useful.

Theorem 4. With the notation of Theorem 3,

(1) the quantity $\sum_{i=0}^{y} P_i$ $(0 \le y \le \mu)$ is equal to 1 for $Y \le y$ and is a decreasing
function of $Y$ for $Y \ge y$;

(2) for any given integer $Y'$ $(\ge 1)$, the sequence of quantities

$$\theta_0 = P_0 \quad\text{for } Y = Y',$$
$$\theta_1 = P_0 + P_1 \quad\text{for } Y = Y' + 1,$$
$$\theta_2 = P_0 + P_1 + P_2 \quad\text{for } Y = Y' + 2,$$
$$\dots$$
$$\theta_\mu = P_0 + P_1 + \dots + P_\mu \quad\text{for } Y = Y' + \mu,$$

has only one maximum, in the sense that we cannot have $\theta_r > \theta_s < \theta_t$ for integers $r, s, t$
satisfying $0 \le r < s < t \le \mu$. (We can have $\theta_i = \theta_{i+1}$.)

I have been unable to prove either of these statements, but proof is not strictly necessary, as
they can always be verified in any particular instance.

Assuming the truth of the statements, it is clear that the lot tolerance condition is equivalent to
saying that $\max(\theta_i) \le \epsilon$, $\theta_i$ being defined as above, and that to make sure that this condition is
satisfied (for any given $\alpha$ and $\beta$) it is merely necessary to calculate $\theta_0, \theta_1, \theta_2, \dots$ in turn until a
maximum is reached. To economize on sample size (this being an increasing function of both
$\alpha$ and $\beta$) we shall choose $\alpha$ and $\beta$ so that $\max(\theta_i) = \epsilon$, since $\theta_i$ is a decreasing function of both
$\alpha$ and $\beta$ (these statements are obvious intuitively if not algebraically). Thus a single infinity of
schemes is defined. Among them there is a series of schemes in which the maximum is attained
by two of the $\theta_i$, namely the schemes for which respectively $\theta_0 = \theta_1 = \epsilon$, $\theta_1 = \theta_2 = \epsilon$, $\theta_2 = \theta_3 = \epsilon$,
&c. The scheme with $\theta_0 = \theta_1 = \epsilon$ can be found directly, being

$$\alpha = 1 - \epsilon^{1/Y'}, \qquad \beta = \epsilon^{1/Y'}\{1 - (Y' + 1)^{-1/Y'}\}. \qquad (2)$$

As we pass to other schemes in the series $\alpha$ increases and $\beta$ decreases.
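The scheme with $\theta_0 = \theta_1 = \epsilon$ follows from solving $\theta_0 = (1 - \alpha)^{Y'} = \epsilon$ for $\alpha$ and then $\theta_1 = \epsilon$ for $\beta$. A minimal sketch of that calculation is given below; the function names are assumptions, and the closed form for $\beta$ is the one obtained by this direct solution.

```python
from math import comb

def theta(alpha, beta, Y_prime, i):
    """theta_i = P_0 + ... + P_i evaluated at Y = Y' + i (chances of Theorem 3)."""
    Y = Y_prime + i
    total = (1 - alpha) ** Y
    for y in range(1, i + 1):
        if alpha + y * beta >= 1:         # beyond the last boundary point
            break
        total += (comb(Y, y) * alpha * (alpha + y * beta) ** (y - 1)
                  * (1 - alpha - y * beta) ** (Y - y))
    return total

def first_scheme(Y_prime, eps):
    """Scheme for which theta_0 = theta_1 = eps."""
    root = eps ** (1 / Y_prime)
    alpha = 1 - root
    beta = root * (1 - (Y_prime + 1) ** (-1 / Y_prime))
    return alpha, beta

alpha, beta = first_scheme(Y_prime=10, eps=0.01)
thetas = [theta(alpha, beta, 10, i) for i in range(3)]
```

For $Y' = 10$, $\epsilon = 0.01$ this gives $\alpha$ and $\beta$ agreeing to four decimal places with the first scheme of the specimen table, and $\theta_2$ already falls below $\epsilon$.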

Thus we see that the inspection procedure is defined by two quantities, $\alpha$ and $\beta$, and that the lot
tolerance condition imposes a relation between them. For any $\alpha$ not less than that given in
equations (2) there is a $\beta$ such that $\max_{0 \le i \le \mu} \theta_i = \epsilon$,* and the scheme so defined will yield average
sample sizes for varying incoming defect numbers $Y$ according to equations (1) and (1'), and an
overall average sample size for a given process curve. If we consider the average sample size for a
fixed value of $Y$ or for a given process curve as a function of $\alpha$, this function is continuous and
consists of a series of smooth curves with corners wherever the order of the maximum $\theta$ changes,
i.e., corners where, as $\alpha$ increases, $\theta_2$ first becomes greater than $\theta_1$, $\theta_3$ becomes greater than $\theta_2$, &c.
Some specimen calculations have suggested, what seems a priori plausible, that the minimum average
sample size often but not invariably occurs at one of these corners. Since it is practicable to tabulate
only a small selection of the infinitely many schemes with any given lot tolerance, we may reasonably
tabulate only those schemes for which two consecutive $\theta$’s are equal to $\epsilon$. It seems that usually one
of these schemes will minimize the average sample size for a given process curve; if not, one of
them will nearly do so. When the tolerance number $Y'$ is large, however, the labour of finding the
schemes (values of $\alpha$ and $\beta$) for which two consecutive $\theta$’s are equal becomes great, and it may be
necessary to consider arbitrary equally spaced values of $\alpha$.
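One way the later schemes of the series might be located numerically is by nested bisection: for each trial $\alpha$ find the $\beta$ making $\theta_i = \epsilon$, then adjust $\alpha$ until $\theta_{i+1} = \epsilon$ as well. The paper does not say how its schemes were computed, so the sketch below is an assumed method; the function names and the bracketing interval for $\alpha$ are likewise assumptions.

```python
from math import comb

def theta(alpha, beta, Y_prime, i):
    """theta_i = P_0 + ... + P_i evaluated at Y = Y' + i (chances of Theorem 3)."""
    Y = Y_prime + i
    total = (1 - alpha) ** Y
    for y in range(1, i + 1):
        if alpha + y * beta >= 1:         # beyond the last boundary point
            break
        total += (comb(Y, y) * alpha * (alpha + y * beta) ** (y - 1)
                  * (1 - alpha - y * beta) ** (Y - y))
    return total

def bisect(f, lo, hi, iters=60):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scheme_with_equal_thetas(Y_prime, eps, i, alpha_lo, alpha_hi):
    """Find (alpha, beta) with theta_i = theta_{i+1} = eps."""
    def beta_for(alpha):
        # theta_i decreases as beta increases, so the root in beta is bracketed
        return bisect(lambda b: theta(alpha, b, Y_prime, i) - eps, 0.0, 1 - alpha)
    alpha = bisect(lambda a: theta(a, beta_for(a), Y_prime, i + 1) - eps,
                   alpha_lo, alpha_hi)
    return alpha, beta_for(alpha)

# Second scheme for Y' = 10, eps = 0.01: theta_1 = theta_2 = eps.
alpha, beta = scheme_with_equal_thetas(10, 0.01, 1, 0.37, 0.45)
```

With the bracket $0.37 \le \alpha \le 0.45$ the search settles close to the second scheme of the specimen table ($\alpha = 0.4010$, $\beta = 0.0784$).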

* If $\alpha$ has the value given in (2), any value of $\beta$ not less than that given in (2) will satisfy the condition
$\max \theta_i = \epsilon$, but the value of $\beta$ in (2) leads to the lowest sample sizes when $Y > 0$. For greater values of
$\alpha$ the corresponding $\beta$ is unique.




222 Linear Sequential Rectifying Inspection for Controlling Fraction Defective [No. 2,

The following table is given as an example. It refers to $Y' = 10$, $\epsilon = 0.01$, and gives the first
four schemes having two consecutive $\theta$’s equal (i.e., $\theta_0 = \theta_1 = \epsilon$, $\theta_1 = \theta_2 = \epsilon$, $\theta_2 = \theta_3 = \epsilon$,
$\theta_3 = \theta_4 = \epsilon$). The number of such schemes is unlimited; one would merely tabulate as many as
seemed useful. Average sample sizes and AOQL are given for each scheme. Minimum average
sample sizes for particular values of $Y$ appear in heavy type.

Lot tolerance Y' = 10, risk ε = 0.01

     Scheme         Average sample size (A/N) for Y equal to                                Corresponding
    α       β       0      1      2      3      4      5      6      8      10     12      AOQL    Y
  0.3690  0.1345  0.369  0.419  0.482  0.563  0.672  0.803  0.890  0.966  0.989  0.996     1.7     4
  0.4010  0.0784  0.401  0.433  0.469  0.511  0.561  0.621  0.693  0.878  0.969  0.992     2.3     6
  0.4354  0.0553  0.435  0.459  0.486  0.516  0.550  0.587  0.630  0.736  0.879  0.973     2.6     7
  0.4679  0.0423  0.468  0.488  0.509  0.532  0.558  0.586  0.617  0.688  0.776  0.889     2.8     9


Schemes with a Curved Acceptance Boundary 

No claim has been made for the inspection procedure considered above that it is in any sense 
the best possible, but merely that it involves the inspection of fewer items, on the average, than a 
Dodge-Romig scheme with the same quality guarantee, and it is specified very simply by two 
constants. More powerful schemes with a curved acceptance boundary would be more troublesome 
to specify (and bulkier to tabulate). But it may be noted that the above work suggests a particular 
curved-boundary procedure, defined by $\theta_0 = \theta_1 = \theta_2 = \dots = \epsilon$, where by $\theta_r$ is meant the chance
that one of the first $r + 1$ boundary points should be reached when the total number $Y$ of defectives
initially is $Y' + r$. This procedure has in fact been studied by S. N. Collings (quite independently
of the present investigation), and it is understood that a joint paper on it is being prepared by him
and G. A. Barnard. The relation between the two procedures has not been fully explored, but it
appears that a Collings scheme is in general appreciably more economical of sampling than a linear- 
boundary scheme with the same lot tolerance. 

References

1 G. A. Barnard, “Sequential tests in industrial statistics,” J.R.S.S. Suppt., VIII (1946), pp. 1-21.

2 H. F. Dodge, “A sampling inspection plan for continuous production,” Ann. Math. Stat., XIV (1943),
p. 264.

3 H. F. Dodge and H. G. Romig, Sampling inspection tables. John Wiley, New York (1944). (Contains
reprints of papers in the Bell System Technical Journal for Oct. 1929 and Jan. 1941.)

4 C. M. Stockman and P. Armitage, “Some properties of closed sequential schemes,” J.R.S.S. Suppt.,
VIII (1946), pp. 104-112.

5 A. Wald, “Sequential tests of statistical hypotheses,” Ann. Math. Stat., XVI (1945), p. 117.

6 A. Wald and J. Wolfowitz, “Sampling inspection plans for continuous production which insure a
prescribed limit on the outgoing quality,” Ann. Math. Stat., XVI (1945), p. 30.







On the Distribution of the Sum of n Sample Values drawn from a Truncated 

Normal Population 

By V. J. Francis, B.Sc., F.Inst.P., A.M.I.E.E. 

(Communication from the Staff of the Research Laboratories of The General Electric Company,

Limited, Wembley, England) 

Summary 

This paper deals with the situation arising when a normal population has all members with 
values greater (or less) than a given value rejected, and where small samples are drawn from the 
remaining truncated population. Formal expressions are deduced from which can be calculated 
the distribution of the sum (or mean) of the sample values as a function of the point at which 
the original parent population is cut. Tables are given for the integral distribution function of the 
sum, standardised in an appropriate fashion, and their importance in solving a type of problem
of quite frequent occurrence is shown. 


1. Introduction 

In the course of the author’s work the problem arose with which this paper deals. The 
solution involved functions which, as far as could be discovered, had not previously been
tabulated, and Dr. N. R. Campbell kindly undertook the laborious work involved in the 
calculation of a number of these functions. The tables which resulted are likely to prove useful 
in many allied problems, and as the matter does not seem to have been treated before — or, at 
any rate, has not been published — it has been thought worth while to place the work on record. 

The problem in question is the following: A population of members exists of which the 
relevant variate is normal or Gaussian and of which the mean and variance are known. The 
members of small samples (each sample consisting of the same number of members) have to be 
used together in such a way that the sum or the mean of the values of the variate in the sample
is less than a given number. The question is : at what value of the variate should the original 
population be cut in order that when members are taken from the truncated population, the 
desired result is achieved with an allowable probability of failure? 

A simple example of such a case which might arise in practice is where the original population 
consists of a factory production of similar metal parts of which the distribution of the lengths 
(or of any other dimension) can be shown to be substantially Gaussian. Two of these parts have
to be used together in such a way that the total length must be less than a given distance. What 
metal parts should be discarded from the production in order that apparatus containing the 
mounted components should have a reject percentage less than a certain amount? Alternatively, 
what part of the production should be discarded to achieve the minimum wastage in time and 
material when the cost of mounting the components is taken into account?

2. The Analytical Solution 

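Let the original population be

$$f(x) = \frac{1}{\sigma_0\sqrt{2\pi}}\,e^{-(x - \bar{x})^{2}/2\sigma_0^{2}}, \qquad (1)$$

that is, a Gaussian distribution with variable $x$, the mean of which is $\bar{x}$ and the standard deviation
$\sigma_0$. There is no loss of generality in changing the units in the usual way so as to bring (1) into
standard form

$$f(y) = \frac{1}{\sqrt{2\pi}}\,e^{-y^{2}/2}, \qquad (2)$$

where $y = \dfrac{x - \bar{x}}{\sigma_0}$, so that the sum of the variates in a sample of $n$ is

$$\frac{\Sigma x - n\bar{x}}{\sigma_0}. \qquad (3)$$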





Suppose that the distribution (2) is truncated at $y = y_0$ to produce the distribution

$$f_1(y) = \frac{e^{-y^{2}/2}}{\int_{-\infty}^{y_0} e^{-u^{2}/2}\,du} \ \text{ for } y \le y_0, \qquad f_1(y) = 0 \ \text{ for } y > y_0, \qquad (4)$$

and let $f_n(y, y_0)$ represent the distribution of the sum of $n$ random members of (4); clearly $f_n$
vanishes for $y > ny_0$.

It is required that the sum of the $n$ random members of (4) should be less than $y_s$ (say) with
a chance of failure $F$. We must therefore establish $y_0$ as a function of $n$, $y_s$ and $F$. That is
to say, we need the value of

$$F_n(y_s, y_0) = \int_{y_s}^{\infty} f_n(y, y_0)\,dy. \qquad (5)$$

We may regard the sum of $(m + n)$ random members of (4) as obtained by adding the sum
of a first group of $m$ to that of a second group of $n$ random members; thus*

$$f_{m+n}(y) = \int_{-\infty}^{\infty} f_m(x)\,f_n(y - x)\,dx. \qquad (6)$$

In the integrand here the first factor vanishes for $x > my_0$ and the second for $(y - x) > ny_0$;
thus we may re-write (6) as

$$f_{m+n}(y) = \int_{y - ny_0}^{my_0} f_m(x)\,f_n(y - x)\,dx. \qquad (7)$$

It is now necessary to compute the functions $f_2$, etc., from the relations (7) and obtain their
integrals according to (5).


3. Calculation of the Functions 

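Certain relations are useful in order to present the tabulated functions in a convenient form.
In the first place we require the mean, $a_1$, and the variance, $\sigma_1^{2}$, of the truncated population, $f_1$.

If we write

$$g(y_0) = \tfrac{1}{2}\Bigl[1 + \operatorname{Erf}\Bigl(\frac{y_0}{\sqrt{2}}\Bigr)\Bigr], \qquad (8)$$

where

$$\operatorname{Erf} x = \frac{2}{\sqrt{\pi}}\int_{0}^{x} \operatorname{Exp}(-y^{2})\,dy, \qquad (9)$$

we obtain for the mean of $f_1$

$$a_1 = \int_{-\infty}^{y_0} y f_1(y)\,dy = -\frac{\operatorname{Exp}(-y_0^{2}/2)}{\sqrt{2\pi}\,g(y_0)}, \qquad (10)$$

and for its variance

$$\sigma_1^{2} = \int_{-\infty}^{y_0} (y - a_1)^{2} f_1(y)\,dy = 1 + a_1 y_0 - a_1^{2}. \qquad (11)$$

Further it is a well-known result that the mean of $f_n$ will be

$$a_n = n a_1, \qquad (12)$$

and the standard deviation of $f_n$

$$\sigma_n = \sigma_1\sqrt{n}. \qquad (13)$$

For the actual computation of $f_2(y_0, y)$, (7) may be integrated giving

$$f_2(y_0, y) = \frac{1}{2\sqrt{\pi}\,g^{2}(y_0)}\operatorname{Exp}\Bigl(-\frac{y^{2}}{4}\Bigr)\operatorname{Erf}\Bigl(y_0 - \frac{y}{2}\Bigr), \qquad (14)$$

and this result was used to obtain the values of $f_2(y_0, y)$. For $f_3$, $f_4$, etc., the integrations cannot
be carried out in the form of known functions, and numerical integration becomes necessary. $f_2$
could, of course, be obtained by numerical integration from $f_1$.

* For a formal proof see H. Cramér, “Random Variables and Probability Distributions.” C.U.P.,
1937, p. 36.

The values of

$$\operatorname{Exp}\Bigl(-\frac{y^{2}}{2}\Bigr), \quad \operatorname{Exp}\Bigl(-\frac{y^{2}}{4}\Bigr), \quad \operatorname{Erf}\Bigl[y_0 - \frac{y}{2}\Bigr]$$

appearing in (4), (8) and (14) were taken from the American Tables of Probability Functions†
at intervals of 0.2 of $y$ to 6 decimal places.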
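Relations (8) to (14) are easily evaluated with a modern error-function routine in place of printed tables. The sketch below is illustrative only: the function names are assumptions, Simpson's rule stands in for the difference formulae used in the original computation, and the closed form for $f_2$ is that obtained by carrying out the integration in (7) for $m = n = 1$.

```python
from math import erf, exp, pi, sqrt

def g(y0):
    """Fraction of the standard normal population lying below the cut y0, (8)."""
    return 0.5 * (1 + erf(y0 / sqrt(2)))

def truncated_moments(y0):
    """Mean a1 and standard deviation sigma1 of the truncated population, (10), (11)."""
    a1 = -exp(-y0 ** 2 / 2) / (sqrt(2 * pi) * g(y0))
    sigma1 = sqrt(1 + a1 * y0 - a1 ** 2)
    return a1, sigma1

def f2(y0, y):
    """Distribution of the sum of two members, (14); zero for y >= 2*y0."""
    if y >= 2 * y0:
        return 0.0
    return exp(-y ** 2 / 4) * erf(y0 - y / 2) / (2 * sqrt(pi) * g(y0) ** 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F2(y0, z):
    """Chance that the sum of two members exceeds the standardised value z."""
    a1, sigma1 = truncated_moments(y0)
    y = 2 * a1 + z * sigma1 * sqrt(2)     # reconvert z into y
    if y >= 2 * y0:
        return 0.0
    return simpson(lambda u: f2(y0, u), y, 2 * y0)

a1, sigma1 = truncated_moments(0.0)
tail = F2(0.0, 1.0)
```

For $y_0 = 0$ these give $a_1$ and $\sigma_1$ in agreement with the heading of Table 1, and the values of $F_2$ agree with the tabulated column to about a unit in the fourth place.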


Table 1.
y_0 = 0.0.    a_1 = -0.7979.    σ_1 = 0.6028.

  n              1         2         4         ∞
  n·a_1      -0.7979   -1.5958   -3.1915
  σ_1·√n      0.6028    0.8525    1.2056

  z           F_1(z)    F_2(z)    F_4(z)    F_∞(z)
  3.0                                       0.0013
  2.9                                       0.0019
  2.8                                       0.0026
  2.7                                       0.0035
  2.6                                       0.0047
  2.5                             0.0000    0.0062
  2.4                             0.0001    0.0082
  2.3                             0.0005    0.0107
  2.2                             0.0014    0.0139
  2.1                             0.0030    0.0179
  2.0                             0.0058    0.0228
  1.9                             0.0100    0.0287
  1.8                   0.0012    0.0160    0.0359
  1.7                   0.0068    0.0242    0.0446
  1.6                   0.0169    0.0349    0.0548
  1.5                   0.0314    0.0482    0.0668
  1.4                   0.0501    0.0641    0.0808
  1.3         0.0114    0.0727    0.0837    0.0968
  1.2         0.0594    0.0989    0.1060    0.1151
  1.1         0.1072    0.1284    0.1313    0.1357
  1.0         0.1547    0.1607    0.1595    0.1587
  0.9         0.2016    0.1954    0.1904    0.1841
  0.8         0.2477    0.2321    0.2237    0.2119
  0.7         0.2930    0.2705    0.2591    0.2420
  0.6         0.3373    0.3100    0.2963    0.2743
  0.5         0.3804    0.3502    0.3348    0.3085
  0.4         0.4223    0.3907    0.3743    0.3446
  0.3         0.4628    0.4312    0.4143    0.3821
  0.2         0.5018    0.4712    0.4545    0.4207
  0.1         0.5392    0.5105    0.4944    0.4602
  0.0         0.5751    0.5488    0.5337    0.5000
 -0.1                   0.5859    0.5721    0.5398
 -0.2                             0.6092    0.5793

z = (y - n·a_1)/(σ_1·√n).
F_1 = 0 at z = 1.324; F_2 = 0 at z = 1.872; F_4 = 0 at z = 2.647.

y_0 is value at which original population is truncated.
a_1 is mean of truncated population.
σ_1 is standard deviation of truncated population.
F_n(z) is the integral from z to ∞ of the frequency distribution of the sum of n sample values, z being
the sum expressed as a standardised variable.


The calculation of $g$, $\sigma_1^{2}$, $f_1$, $F_1$, $f_2$ then requires only addition and multiplication. $F_2$ was
derived from $f_2$ by numerical integration; for integration the formulae given in Interpolation
and Allied Tables (Stationery Office), involving only odd-order differences, were used. $F_n$ for
$n > 2$ involves two integrations, one of a product $f_n \cdot f_{n-p}$, giving $f_{2n-p}$, and a second
giving $F_{2n-p}$. If $p = 0$, the product is symmetrical about a central value; the labour of calculating
necessary products and of performing the first integration is then much less than if $p$ has
any other value. Accordingly $f_4$ was calculated from the formula

$$f_4(y_0, y) = \int_{y - 2y_0}^{2y_0} f_2(u)\,f_2(y - u)\,du, \qquad (15)$$

and $F_4$ from this value of $f_4$. Since $p$ cannot be 0 when $2n - p$ is odd, $f_3$ and $F_3$ were not
calculated. In order to limit the labour, the values of $y$ were limited to those for which $F$ is not
much greater than ½; other values are not likely to be required in any problem of the kind
considered.

† Tables of Probability Functions, Vol. I, issued by the Federal Works Agency of New York.
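The two integrations needed for $F_4$ can be sketched numerically as follows. This is an illustration under stated assumptions: the function names are hypothetical, the limits of integration are those that (7) gives for $m = n = 2$, and Simpson's rule replaces the difference formulae of the original computation.

```python
from math import erf, exp, pi, sqrt

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def f2(y0, y):
    """Distribution of the sum of two members of the truncated population."""
    if y >= 2 * y0:
        return 0.0
    g = 0.5 * (1 + erf(y0 / sqrt(2)))
    return exp(-y ** 2 / 4) * erf(y0 - y / 2) / (2 * sqrt(pi) * g ** 2)

def f4(y0, y):
    """First integration: f2 folded with itself over the range where both factors differ from zero."""
    lo, hi = y - 2 * y0, 2 * y0
    if lo >= hi:
        return 0.0
    return simpson(lambda u: f2(y0, u) * f2(y0, y - u), lo, hi)

def F4(y0, y):
    """Second integration: chance that the sum of four members exceeds y."""
    if y >= 4 * y0:
        return 0.0
    return simpson(lambda v: f4(y0, v), y, 4 * y0)

# Example for y0 = 0 at z = 0, i.e. at y = 4*a1 with a1 = -sqrt(2/pi).
y_at_mean = -4 * sqrt(2 / pi)
tail = F4(0.0, y_at_mean)
```

For $y_0 = 0$ the value so obtained at $z = 0$ may be compared with the entry for $F_4$ in Table 1.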


Table 2.
y_0 = 0.5.    a_1 = -0.5092.    σ_1 = 0.6973.

  n              1         2         4         ∞
  n·a_1      -0.5092   -1.0183   -2.0366
  σ_1·√n      0.6973    0.9861    1.3945

  z           F_1(z)    F_2(z)    F_4(z)    F_∞(z)
  2.8                                       0.0026
  2.7                                       0.0035
  2.6                                       0.0047
  2.5                             0.0000    0.0062
  2.4                             0.0008    0.0082
  2.3                             0.0017    0.0107
  2.2                             0.0032    0.0139
  2.1                   0.0000    0.0056    0.0179
  2.0                   0.0003    0.0091    0.0228
  1.9                   0.0028    0.0140    0.0287
  1.8                   0.0082    0.0206    0.0359
  1.7                   0.0167    0.0292    0.0446
  1.6                   0.0282    0.0400    0.0548
  1.5         0.0000    0.0430    0.0532    0.0668
  1.4         0.0169    0.0610    0.0691    0.0808
  1.3         0.0536    0.0822    0.0877    0.0968
  1.2         0.0912    0.1064    0.1092    0.1151
  1.1         0.1298    0.1335    0.1334    0.1357
  1.0         0.1690    0.1634    0.1605    0.1587
  0.9         0.2088    0.1956    0.1901    0.1841
  0.8         0.2488    0.2300    0.2222    0.2119
  0.7         0.2890    0.2662    0.2564    0.2420
  0.6         0.3292    0.3038    0.2924    0.2743
  0.5         0.3691    0.3424    0.3300    0.3085
  0.4         0.4086    0.3818    0.3687    0.3446
  0.3         0.4474    0.4214    0.4081    0.3821
  0.2         0.4854    0.4610    0.4479    0.4207
  0.1         0.5225    0.5001    0.4876    0.4602
  0.0         0.5584    0.5386    0.5269    0.5000
 -0.1                   0.5761    0.5655    0.5398

F_1 = 0 at z = 1.447; F_2 = 0 at z = 2.047; F_4 = 0 at z = 2.895.

For meaning of symbols see Table 1.


In the next stage, the tables with argument $y$ were converted by interpolation to tables with
argument $z$ where

$$z = \frac{y - n a_1}{\sigma_1\sqrt{n}}. \qquad (16)$$

The advantage of this procedure is that the tables then show how, with increasing $n$, $F_n$ approaches
the limit

$$F_\infty(z) = \tfrac{1}{2}\Bigl[1 - \operatorname{Erf}\Bigl(\frac{z}{\sqrt{2}}\Bigr)\Bigr]$$

as $f_n$ approaches the Gaussian form.

Up to this stage 6 decimal places were retained and the maximum error appeared to be 3 in
the last place. Finally the values were cut down to 4 places; apart from slips, the values should
therefore be accurate to 1 unit in the last place.


Table 3.
y_0 = 1.0.    a_1 = -0.2876.    σ_1 = 0.7935.

  n              1         2         4              ∞
  n·a_1      -0.2876   -0.5752   [illegible]
  σ_1·√n      0.7935    1.1222   [illegible]

  z           F_1(z)    F_2(z)    F_4(z)    F_∞(z)
  3.2                                       0.0007
  3.1                                       0.0010
  3.0                                       0.0013
  2.9                                       0.0019
  2.8                             0.0001    0.0026
  2.7                             0.0003    0.0035
  2.6                             0.0006    0.0047
  2.5                             0.0012    0.0062
  2.4                             0.0021    0.0082
  2.3                             0.0036    0.0107
  2.2                   0.0000    0.0057    0.0139
  2.1                   0.0023    0.0087    0.0179
  2.0                   0.0056    0.0128    0.0228
  1.9                   0.0106    0.0182    0.0287
  1.8                   0.0176    0.0252    0.0359
  1.7                   0.0269    0.0340    0.0446
  1.6         0.0052    0.0385    0.0447    0.0548
  1.5         0.0293    0.0527    0.0578    0.0668
  1.4         0.0553    0.0696    0.0732    0.0808
  1.3         0.0829    0.0892    0.0912    0.0968
  1.2         0.1123    0.1115    0.1118    0.1151
  1.1         0.1433    0.1366    0.1350    0.1357
  1.0         0.1757    0.1642    0.1609    0.1587
  0.9         0.2094    0.1942    0.1893    0.1841
  0.8         0.2443    0.2266    0.2202    0.2119
  0.7         0.2802    0.2608    0.2532    0.2420
  0.6         0.3169    0.2967    0.2881    0.2743
  0.5         0.3541    0.3339    0.3247    0.3085
  0.4         0.3916    0.3722    0.3626    0.3446
  0.3         0.4292    0.4111    0.4014    0.3821
  0.2         0.4667    0.4502    0.4408    0.4207
  0.1         0.5037    0.4893    0.4804    0.4602
  0.0         0.5402    0.5280    0.5198    0.5000
 -0.1                   0.5659    0.5586    0.5398

F_1 = 0 at z = 1.623; F_2 = 0 at z = 2.295; F_4 = 0 at z = 3.246.

For meaning of symbols see Table 1.


Tables 1-5 give the values of $F_1$, $F_2$, $F_4$, $F_\infty$ for $y_0$ = 0.0, 0.5, 1.0, 1.5, 2.0; $F_\infty$ is, of course,
the same for all values of $y_0$. Each column is headed by the values of $na_1$ and $\sigma_1\sqrt{n}$, which are
necessary to reconvert $z$ into $y$ according to (16). Values of $F_3$ and $F_n$ for $n > 4$, sufficiently
accurate for many purposes, can be obtained from the values of $F_1$, $F_2$, $F_4$ and $F_\infty$. For $n > 4$
it may be desirable to use some function of $n$, such as the reciprocal, for the variable in the
interpolation. The six figure tables for the $F$’s and $f$’s and the intermediate steps are available
on application to the author.
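A value of $F_n$ for $n > 4$ might, for instance, be estimated by linear interpolation in $1/n$ between the columns for $n = 4$ and $n = \infty$. The paper suggests the reciprocal as the variable but does not prescribe the interpolation formula, so the linear rule and the function name below are assumptions made for illustration.

```python
def interpolate_in_reciprocal(n, n_a, F_a, n_b, F_b):
    """Linear interpolation of F_n in the variable 1/n between two tabulated
    values of n. Pass n_b = float("inf") to use the Gaussian limit as one end."""
    u, u_a, u_b = 1 / n, 1 / n_a, 1 / n_b
    return F_a + (F_b - F_a) * (u - u_a) / (u_b - u_a)

# Table 1 (y_0 = 0.0), z = 1.0: F_4 = 0.1595, F_inf = 0.1587.
F8_estimate = interpolate_in_reciprocal(8, 4, 0.1595, float("inf"), 0.1587)
```

Since $1/8$ lies midway between $1/4$ and $0$, the estimate for $n = 8$ is simply the mean of the two tabulated values.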


Table 4.
y_0 = 1.5.    a_1 = -0.1388.    σ_1 = 0.8789.

  n              1         2         4         ∞
  n·a_1      -0.1388   -0.2776   -0.5552
  σ_1·√n      0.8789    1.2430    1.7579

  z           F_1(z)    F_2(z)    F_4(z)    F_∞(z)
  3.5                                       0.0002
  3.4                                       0.0003
  3.3                                       0.0005
  3.2                                       0.0007
  3.1                             0.0001    0.0010
  3.0                             0.0002    0.0013
  2.9                             0.0003    0.0019
  2.8                             0.0006    0.0026
  2.7                             0.0010    0.0035
  2.6                   0.0000    0.0016    0.0047
  2.5                   0.0003    0.0026    0.0062
  2.4                   0.0011    0.0039    0.0082
  2.3                   0.0025    0.0058    0.0107
  2.2                   0.0047    0.0084    0.0139
  2.1                   0.0079    0.0119    0.0179
  2.0                   0.0123    0.0163    0.0228
  1.9                   0.0181    0.0221    0.0287
  1.8         0.0081    0.0255    0.0293    0.0359
  1.7         0.0223    0.0347    0.0382    0.0446
  1.6         0.0382    0.0460    0.0488    0.0548
  1.5         0.0560    0.0595    0.0615    0.0668
  1.4         0.0757    0.0753    0.0765    0.0808
  1.3         0.0974    0.0936    0.0939    0.0968
  1.2         0.1211    0.1144    0.1137    0.1151
  1.1         0.1468    0.1378    0.1360    0.1357
  1.0         0.1744    0.1637    0.1608    0.1587
  0.9         0.2039    0.1920    0.1882    0.1841
  0.8         0.2352    0.2225    0.2179    0.2119
  0.7         0.2680    0.2552    0.2499    0.2420
  0.6         0.3022    0.2897    0.2838    0.2743
  0.5         0.3376    0.3257    0.3195    0.3085
  0.4         0.3739    0.3630    0.3567    0.3446
  0.3         0.4110    0.4013    0.3950    0.3821
  0.2         0.4484    0.4400    0.4340    0.4207
  0.1         0.4860    0.4790    0.4734    0.4602
  0.0         0.5233    0.5179    0.5128    0.5000
 -0.1         0.5603    0.5563    0.5519    0.5398

F_1 = 0 at z = 1.864; F_2 = 0 at z = 2.636; F_4 = 0 at z = 3.727.

For meaning of symbols see Table 1.


The tables are unlikely to be used precisely in the form given for all problems; but a merit
of that form is that it appears to be the most concise from which any information required may
be most easily derived. Thus in one of the problems from which the investigation arose, it is
necessary to know the relation between $y$ and $y_0$ for given small values of $F$. This information
can be derived by finding from the tables, by inverse linear interpolation, the values of $z$
corresponding to a given value of $F$ for the various values of $y_0$ and converting these values of $z$ into
$y$. The results for $F_2$, $F_4$ are shown graphically in Figs. 1 and 2, where each curve relates to the
value of $F$ marked against it. The curves give the value of $y_0$, the value at which the







Table 5.
y0 = 2.0.    a1 = -0.0552.    σ1 = 0.9415.

    n            1         2         4         ∞
    n a1      -0.0552   -0.1105   -0.2210
    σ1 √n      0.9415    1.3315    1.8830

    z         F1(z)     F2(z)     F4(z)     F∞(z)
   3.9                                      0.0000
   3.8                                      0.0001
   3.7                                      0.0001
   3.6                                      0.0002
   3.5                            0.0000    0.0002
   3.4                            0.0001    0.0003
   3.3                            0.0001    0.0005
   3.2                            0.0002    0.0007
   3.1                            0.0003    0.0010
   3.0                  0.0000    0.0005    0.0013
   2.9                  0.0001    0.0008    0.0019
   2.8                  0.0004    0.0012    0.0026
   2.7                  0.0008    0.0019    0.0035
   2.6                  0.0015    0.0028    0.0047
   2.5                  0.0025    0.0040    0.0062
   2.4                  0.0040    0.0057    0.0082
   2.3                  0.0061    0.0079    0.0107
   2.2                  0.0089    0.0107    0.0139
   2.1        0.0047    0.0126    0.0145    0.0179
   2.0        0.0113    0.0173    0.0192    0.0228
   1.9        0.0192    0.0233    0.0251    0.0287
   1.8        0.0284    0.0308    0.0324    0.0359
   1.7        0.0393    0.0399    0.0412    0.0446
   1.6        0.0518    0.0508    0.0518    0.0548
   1.5        0.0661    0.0637    0.0643    0.0668
   1.4        0.0824    0.0788    0.0788    0.0808
   1.3        0.1008    0.0961    0.0956    0.0968
   1.2        0.1213    0.1158    0.1148    0.1151
   1.1        0.1440    0.1380    0.1364    0.1357
   1.0        0.1688    0.1625    0.1604    0.1587
   0.9        0.1958    0.1895    0.1869    0.1841
   0.8        0.2250    0.2187    0.2157    0.2119
   0.7        0.2561    0.2501    0.2468    0.2420
   0.6        0.2890    0.2834    0.2799    0.2743
   0.5        0.3235    0.3185    0.3149    0.3085
   0.4        0.3594    0.3551    0.3515    0.3446
   0.3        0.3964    0.3928    0.3894    0.3821
   0.2        0.4342    0.4313    0.4282    0.4207
   0.1        0.4725    0.4703    0.4675    0.4602
   0.0        0.5109    0.5095    0.5070    0.5000
  -0.1        0.5491    0.5484    0.5464    0.5398

F1 = 0 at z = 2.183; F2 = 0 at z = 3.087; F4 = 0 at z = 4.366.

For meaning of symbols see Table 1.


distribution must be cut, in order that the probability that the sum of the values of selected
members will exceed an assigned value shall be the value of F marked against the curve.

4. Example of the Use of the Tables and Figures

Consider the problem of the metal parts mentioned in Section 1. Suppose it is found that
the parts (which are being manufactured to a nominal mean length of 0.5 inch) have, in fact, a
mean value of 0.503 inch with a standard deviation of 0.01 inch. Two of the parts are used
together and in at least 99 cases out of 100 the total length must not exceed 1.02 inches. What
should be the upper limit of acceptance of the parts?




Here σ0 = 0.01; x̄ = 0.503; n = 2.

From equation (3) we must have the sum of the lengths in a random sample of two less than

    y = (1.02 - 2 × 0.503)/0.01 = 1.4

with a chance of failure of 0.01.

Reference to Fig. 1 shows that for these values of y and F2 the value of y0 is 0.9, so that all
parts longer than 0.512 inch should be rejected.
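The arithmetic of this step may be set out as a short sketch. It assumes, as the later reference to the point y = 1.4 suggests, that the limit on the sum is expressed in units of the standard deviation of a single part, measured from n times the mean; the function names are illustrative and do not come from the paper.

```python
def standardised_sum_limit(total_limit, mean, sd, n):
    """Limit on the sum of n parts, from n * mean, in units of sd."""
    return (total_limit - n * mean) / sd

def acceptance_limit(mean, sd, y0):
    """Length corresponding to a truncation point y0 (in units of sd)."""
    return mean + y0 * sd

if __name__ == "__main__":
    y = standardised_sum_limit(1.02, 0.503, 0.01, 2)
    print(f"y = {y:.2f}")
    # y0 = 0.9 is the value read from Fig. 1 in the text.
    print(f"acceptance limit = {acceptance_limit(0.503, 0.01, 0.9):.3f} inch")
```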



Fig. 1.

Value (y0) at which normal population should be truncated.
F2 gives the chance that the sum of the values of the variate in samples of two is greater than y.


Now suppose that a test is applied to the mounted components and all that fall outside the
limit of 1.02 inches are rejected. Suppose, further, that the nature of the process in which the
metal parts are used makes it impracticable to select them in pairs. The number of mounted
components rejected on final test will, of course, depend upon the acceptance
limit of the individual components. The effective cost of mounting a component, we will say, is
9 times its manufacturing cost, so that the value of a pair of mounted components is 20 times
the cost of manufacture of a single component. What should be the acceptance limit for the parts
in order to obtain a minimum cost of manufacture of the mounted components?

Suppose that parts of length greater than 0.513 inch (that is, y0 = 1.0) are
rejected. From Table 2 we find that this involves the initial rejection of
16 per cent. of the components. To show the use of the Tables, we will find from Table 3 the
percentage of the mounted components to be rejected. We find that a1 = -0.288, √2 σ1 = 1.122,
so that from equation (18)

    z = (1.4 + 0.576)/1.12 = 1.76

and interpolating in Table 3 we find that, approximately,

    F2(z) = 0.021.



Fig. 2.

Value (y0) at which normal population should be truncated.
F3 gives the chance that the sum of the values of the variate in samples of three is greater than y.

Therefore 2.1 per cent. of the mounted components have to be rejected. Since in this example
only rough values are required, the result could be obtained also by interpolating in Fig. 1; we
then find for what value of F2 the curve passes through the point y = 1.4, y0 = 1.0. It is
necessary to use the Tables when accurate values are required.
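The figure of about 2 per cent. can be checked without the Tables by direct numerical integration. The sketch below is not the method of the paper; it assumes only that the two parts are independent, that each follows a unit normal distribution truncated above at y0, and that the limit on the sum is expressed in the same standardised units. The function name and the number of integration steps are arbitrary choices.

```python
import math

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tail_sum_of_two(s, y0, steps=2000):
    """Chance that X1 + X2 > s, each X a unit normal truncated above at y0.

    For a first value x the second must exceed s - x, which is only
    possible when s - x < y0, i.e. when x > s - y0.
    """
    lower = s - y0
    if lower >= y0:
        return 0.0                      # the sum can never exceed s
    lower = max(lower, -10.0)           # negligible density below -10
    h = (y0 - lower) / steps
    total = 0.0
    for i in range(steps):              # midpoint rule
        x = lower + (i + 0.5) * h
        total += normal_pdf(x) * (normal_cdf(y0) - normal_cdf(s - x))
    return total * h / normal_cdf(y0) ** 2

if __name__ == "__main__":
    print(f"F2 at y = 1.4, y0 = 1.0: {tail_sum_of_two(1.4, 1.0):.4f}")
```

With y = 1.4 and y0 = 1.0 the integral comes out close to the interpolated value 0.021, and it is exactly zero when y0 is reduced to 0.7.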

If the cost of each metal part is c and we start with n such parts, after the initial rejection
0.84n parts have to be mounted and 0.84n × 0.979 parts are finally used. The total cost of
these 0.82n parts is

    nc + (0.84n × 9c).





Thus the cost per part finally used is 10.43c. In Fig. 3 the full curve shows the variation of this
cost with y0. The cost is given by the scale on the left. The scale on the right for the dotted curve
gives the percentage of parts rejected as a result of the first acceptance test. It is seen that the
minimum cost is about 10.3c, and it occurs for a value of y0 = 0.75, corresponding to an
acceptance limit of 0.5105 inch. In this particular example it would, of course, be absurd to consider
reducing the acceptance limit below y0 = 0.7, since the possibility of getting a final value for two
components greater than 1.02 just becomes zero when everything greater than 0.510 is rejected
on first test.
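The cost calculation above can be written as a one-line function. This is a minimal sketch of the arithmetic only: costs are in units of c, the cost of one metal part, and the mounting cost of 9c per part is the figure implied by the total-cost expression in the text. The function and parameter names are illustrative.

```python
def cost_per_part_used(accept_fraction, final_pass_fraction,
                       mounting_cost_ratio=9.0):
    """Cost, in units of c, of each part that survives both tests.

    Every part is made (cost 1 each); only accepted parts are mounted;
    only mounted parts passing the final test are used.
    """
    total_cost = 1.0 + accept_fraction * mounting_cost_ratio
    parts_used = accept_fraction * final_pass_fraction
    return total_cost / parts_used

if __name__ == "__main__":
    # 84 per cent. accepted initially, 97.9 per cent. passing final test.
    print(f"{cost_per_part_used(0.84, 0.979):.2f} c per part used")
```

Carrying the unrounded fraction 0.84 × 0.979 gives a result close to 10.4c; the figure quoted in the text uses the rounded fraction 0.82.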

The calculations do show, however, that in the case considered it is necessary to set the initial 
acceptance limit at a value which gives practically no rejects on final test. In other cases of 
practical importance the minimum cost is obtained when there are rejects both on initial and final
test. 






Statistical Control Applied to High Duty Iron Production
By E. W. Harding

[Paper discussed before the London Group of the Industrial Applications Section of The Royal
Statistical Society, January 26th, 1945.]

Introduction 

The use of statistical control in foundry work is somewhat of a departure from the type of application 
which has been most frequently discussed at public meetings and in the technical press. For this 
reason it is felt the subject will provide some new interest and provoke discussion. The subject 
will be dealt with from the point of view of the technical man applying statistical methods to a 
branch of industry sorely in need of help in its control problems. 

The application which will be described is that of controlling metal quality in foundries making 
Meehanite Metal, a high-duty cast iron. The problems present novel features of statistical control 
in industry. The control system developed by the author for these foundries will be described, 
together with the results obtained in practical operation. 

The Control System 

The development work in applying the control system was carried out within a group of some 
15 foundries producing Meehanite Metal. Each foundry operated under widely different conditions 
of output, equipment, control facilities, types of metal, etc., and this added to the difficulties of 
establishing a standard system. On the other hand, a favourable factor was the centralized and 
uniform system of technical control prevailing for the whole group. 

A brief description of the operating conditions will help in understanding the nature of the 
problems involved. The metal is melted in a vertical shaft furnace, called a cupola. Solid metal 
and fuel (coke) are charged continuously in at the top of the shaft. Combustion takes place at 
an intermediate zone by forced blast entering through air-ports lower down the cupola. The molten 
metal drops to the bottom of the shaft, from where it is tapped into ladles, either continuously or 
at intervals. There is a continuous supply of molten metal, and this metal, taken away in ladles, is 
subsequently poured into moulds to make Meehanite castings. Each ladle of metal is tested for its 
suitability for the particular type of casting to be produced, but in the time available this test can 
be qualitative only, providing an indication of metal structure obtainable in the casting. The 
complete testing for chemical and physical properties is carried out later. 

The aim of technical control is to ensure standard chemical and physical properties in the metal 
produced. The complexity of the problem arises from the many sources of variation present; 
raw materials variation, charge weighing, charging practice, variable combustion conditions, etc., 
and the varying extent to which these operations are dependent on the human element. 

In the past the problem has been approached entirely by judgment applied to the various 
operating controls ; for example, control of raw material composition, accuracy of charge weights, 
charging procedure, blast volume measurements, etc. These controls are subject to special super- 
vision, and every effort is made to keep variations from standard practice within reasonable limits. 
The method, however, failed to distinguish between variations in the final product due to causes 
inherent in the process and operating conditions and those due to assignable causes. It also failed 
to relate cause and effect — that is, the operating practice and the result in terms of metal test values. 

In these circumstances there was always difficulty in deciding when corrective action should be 
taken to bring test values back to standard. This, again, was a matter of personal judgment, and 
often the correction only resulted in a more serious variation. Also, the nature of the corrective 
action to be applied was seldom clearly indicated. 

A further limitation in the normal method of control was in regard to measuring the standard 
of control in operation over any given period or for any particular foundry. It was not possible 
to obtain a quantitative measure of control, by which the progress of a foundry could be accurately 
assessed or comparison made between one foundry and another. The nearest approach obtainable 
was an estimate, based on human judgment, of the standard of operating practice. 



However, as is frequently the case in industry, there was available at each of these foundries a
mass of test data which was not being sufficiently utilised. Each foundry engaged in producing
Meehanite Metal normally takes from three to six samples of the metal melted per day for complete
testing, depending on the number of types of metal produced and the quantity of metal melted.

It was evident that these test results [illegible] and the first
step in applying statistical control was the collection of test data from all the foundries concerned
for the previous six months. These consisted of chemical and physical test results for the various
types of Meehanite Metal produced, and in each case values for Average and Standard Deviation
were obtained for each property and each foundry.

From this work, and even at this early stage in the development of the control system, some very
interesting and useful information was obtained. Differences between foundries in variation values
were shown clearly, and it at once became evident that a sound basis for measuring the standard of
control in operation at each foundry had been found. It was noted also that there was little or no
relation between the magnitude of the test values (Average) and the variation value (Standard
Deviation). This meant that a standard value for variation could be set, regardless of the type of
metal produced. It was also observed that the standard of control was not dependent so much on
the control facilities available at the foundry as upon the conscientious and consistent effort on
the part of the control staff to operate to standard practice. Further it was found that in general
the standard of control in operation was lower than had been assumed.

The next step was the setting of standards to be applied to the whole group of foundries.
Standards were already in force for the magnitude of chemical and physical test values for each type
of Meehanite Metal. Standards were now set for permissible deviation from these values, based
on what the best controlled foundries could do and had already done. That is, Standard Deviation
values were established in Meehanite practice for the principal chemical and physical properties.
The values for the various properties are shown in Table I.

Table I

Standard Deviation Values for Meehanite Metal

    Total carbon        0.08%
    Silicon             0.12%
    Manganese           0.06%
    Tensile strength    1.00 ton/sq. in.

At this point interest was aroused in testing errors. It was felt desirable to establish to what
extent these errors, rather than true metal variation, were responsible for the total variation.
This applied particularly to tensile testing which was suspected of being unreliable. A preliminary
series of tests was run at seven different foundries, and in each case six tensile bars were cast from
the same ladle of molten metal, under controlled pouring, so that the conditions were the best
obtainable for uniform results. Machining and testing of these bars was then carried out according
to usual practice for the foundry concerned. The results obtained are set out in Table II.

These results indicated that testing errors were variable and generally large; in some cases
higher than the total permissible metal variation. Steps were taken to reduce these errors, and
subsequent control checks showed that for good average condition the normal Range for tensile
testing errors for samples or groups of six was from 0.50 to 1.00 ton/sq. in. That is, of the total
permissible variation due to metal and testing, represented by a Standard Deviation value of
1.00 ton/sq. in., the variation due to testing is equivalent to a Standard Deviation value of from
0.2 to 0.4 ton/sq. in.* This still appears a high proportion of the total, but is the best that can be
done with present testing technique and equipment. Errors in chemical testing were less serious,
but in some cases required controlling and reduction.

* In this class of work the testing error of necessity includes any variability in metal quality of bars
cast from the same ladle.

The establishment of the control system finally adopted followed on the above preliminary
work. However, a number of questions had to be settled before control chart work could be
started. These included:

1. Measure of Variation to be Used. Range was adopted chiefly on account of greater
ease in calculation than for Standard Deviation, an important point in an application of this
kind.

Table II

Consistency Tests for Tensile Testing

Foundry No.            1        2        3        4        5        6        7
Type Meehanite         GD       GC       GA       GD       GC       GD       GB
Test bar dia.:
  As cast, inch        0.875    0.875    1.20     0.875    0.875    0.875    0.875
  Machined, inch       0.505    0.564    0.798    0.505    0.564    0.564    0.564
Testing machine        Avery    Buckton  Avery    Buckton  Buckton  Buckton  Avery
                                                           (hand
                                                           operated)
Tensile obtained, tons/sq. in.:
  Bar No. 1            17.70    18.40    25.25    17.00    19.86    17.55    23.00
          2            18.00    19.20    26.47    15.75    20.00    17.68    22.70
          3            17.93    19.84    25.35    18.90    19.29    17.80    21.80
          4            16.63    19.16    23.26    17.50    18.11    17.26    22.60
          5            17.06    19.04    24.85    20.00    19.11    17.43    22.00
          6            17.46    19.72    22.20    17.70    18.42    17.40    21.70
  Average              17.46    19.23    24.56    17.81    19.13    17.52    22.30
  Range                1.37     1.44     4.27     4.25     1.89     0.54     1.30


2. Sample Size. Considerations of sampling procedure and time of completion of a group
of tests made for the eventual choice of a sample size of six.

3. Control Chart Limits. 99.8 per cent. limits were adopted, as being the simplest to
operate and understand. No inner limits were used in the early stages of this work.

4. Basis of Chart. That is, whether the chart averages and limits should be based on the
past performance of each individual foundry or on a single set of standards. The Standard
chart was adopted, because the essential point in Meehanite practice is that, for each type of
Meehanite Metal, all foundries shall produce the same material and operate to the same standard
of control.

5. Sampling Procedure. A Standard sampling practice was found to be difficult owing
to the varied conditions of operation, especially in regard to tonnage produced, tap weights
and number of types of Meehanite melted. For example, a foundry operating with a large
number of small tap weights would tend to show a lower ratio of weight of metal tested to
total melt than a foundry using a small number of large tap weights. In extreme cases of the
latter kind almost complete 100 per cent. sampling could be obtained, i.e., all taps were
sampled. An example of this is seen in blast-furnace practice, where, in fact, there is 100
per cent. sampling. However, it was found possible to establish a maximum and minimum
sampling ratio which would meet all cases; that is, from 1 in 5 to 1 in 10, with 1 in 7 as
standard practice. For example, a foundry melting 15 tons of the same type of Meehanite
Metal and tapping the molten metal in 1-ton lots (that is, 15 ladles of 1 ton each), would sample
and test two ladles, thereby completing one group in three days for that type of Meehanite.
It will be realized that each ladle of metal is assumed to be homogeneous and therefore the
sample is an estimate of the properties, not of the material in the ladle, but of the heat or melt
as a whole. That is to say, the population is the total metal melted, of which the ladles represent
the individuals. The sampling ratio is thus reasonably high and, as a further assurance of a
reliable estimate of product properties, a special kind of representative sampling was adopted.
This consists of sampling in rotation, whereby, instead of sampling always from a fixed
position, the position of the ladle sampled is progressively altered. Thus, Tap 1 on the first
day, Tap 2 on the second day, and so on to Tap 7 on the seventh day, after which a return is
made to Tap 1.

6. Test Results to be Charted. In order to make the system simple and easy to operate
without undue clerical work, only the most important properties are charted. These are
Total Carbon and Silicon, representing chemical composition, and tensile strength for the
physical properties. For convenience, the Average and Range of each of these three properties
are put on the one sheet. This also assists chart interpretation, as will be explained later.
With each foundry producing normally two to three main types of Meehanite Metal, each
foundry has usually two control sheets, and sometimes three or more, to operate. They are
not encouraged to chart more than three types of metal, since it rarely happens that three
mixes represent less than 75 per cent. of the total foundry output. There are good reasons
for this, mainly in that the standard of control can accurately be assessed on 70-80 per cent.
of the output; that, in general, satisfactory practice for this proportion of the melt ensures
correct practice and provides a reasonable assurance of quality for the remainder; that small
quantities of special mixes require too long a time for completion of groups, and that a
multiplicity of charts tends to confuse the issue, and makes for more clerical work than is desirable,
especially in the present stage of development. A minimum of two control sheets is preferred,
since the chart results for a single mix, unsupported by those for other mixes, are less easily
interpreted.
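The choices listed above (Range as the measure of variation, groups of six, 99.8 per cent. limits) can be illustrated with a short sketch. The paper does not state its chart formulae; the code assumes the usual normal-theory constants, namely that the mean Range in samples of six is 2.534 times the Standard Deviation and that 99.8 per cent. limits for an Average lie at 3.09 standard errors on either side of the standard value. The standard tensile value of 20.0 tons/sq. in. used below is hypothetical. The bar results are those of Table II.

```python
import math

D2_N6 = 2.534   # assumed constant: mean range / sigma, normal samples of six
Z_998 = 3.09    # assumed constant: 1-in-1000 normal point in each tail

def range_to_sd(group_range, d2=D2_N6):
    """Standard Deviation equivalent to a given mean Range."""
    return group_range / d2

def average_limits(standard_mean, standard_sd, n=6, z=Z_998):
    """Lower and upper 99.8 per cent. limits for the Average of n tests."""
    half_width = z * standard_sd / math.sqrt(n)
    return standard_mean - half_width, standard_mean + half_width

# Tensile results from Table II, tons/sq. in., six bars per foundry.
TABLE_II = {
    1: [17.70, 18.00, 17.93, 16.63, 17.06, 17.46],
    2: [18.40, 19.20, 19.84, 19.16, 19.04, 19.72],
    3: [25.25, 26.47, 25.35, 23.26, 24.85, 22.20],
    4: [17.00, 15.75, 18.90, 17.50, 20.00, 17.70],
    5: [19.86, 20.00, 19.29, 18.11, 19.11, 18.42],
    6: [17.55, 17.68, 17.80, 17.26, 17.43, 17.40],
    7: [23.00, 22.70, 21.80, 22.60, 22.00, 21.70],
}

def summarise(bars):
    """Average and Range of one group of bars."""
    return sum(bars) / len(bars), max(bars) - min(bars)

if __name__ == "__main__":
    for foundry, bars in TABLE_II.items():
        avg, rng = summarise(bars)
        print(f"Foundry {foundry}: average {avg:.2f}, range {rng:.2f}, "
              f"equivalent s.d. {range_to_sd(rng):.2f}")
    # Hypothetical standard mean; Standard Deviation 1.00 from Table I.
    print(average_limits(20.0, 1.00))
```

On these assumptions a Range of 0.50 to 1.00 ton/sq. in. corresponds to a Standard Deviation of about 0.2 to 0.4 ton/sq. in., in agreement with the figures quoted for testing error.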

The control chart finally adopted takes the form of a sheet measuring 20 inches high by 16 
inches wide, on which group values for Average and Range of the three properties are plotted. 
This chart covers from six months to one year’s operation. Modifications have been made as ex- 
perience is gained with the system, and further changes are expected, particularly in regard to 
reduction of standard control limits, as operating practice and control improves. 

The installation of the system presented no difficulty. A standard Instruction Sheet was issued 
explaining the system in very simple language and co-operation in putting it into practice was readily 
obtained. 

Practical Results Obtained from System 

Early results, charted before the system was actually in operation, showed clearly many failures
to conform to Standard and a serious lack of control. One example is shown in Fig. 1. Figs. 2
and 3 show the very definite improvement obtained almost immediately after putting the system
into operation. This improvement took the form both of a closer adherence to Standard values
for the properties (Average charts), so that the groups fall completely within the limit lines, and also
of a considerable reduction in product variation (Range charts), giving in some cases Range values
far below the Standard values.

Fig. 1.

Meehanite control chart. Illustrating type of result obtained before installation of control system.

Fig. 2.

Meehanite control chart, Meehanite 'GC', tensile. Showing improvement in standard of control
from introduction of chart system (from July).

Fig. 3.

Showing present results with improved standard of control.

This remarkable improvement was due to two main factors. Firstly, the psychological effect
of this form of chart, which by drawing attention to both Average and Range values produces
a conscious effort to meet the chart limits. Secondly, the control charts give early warning of an
impending change in product properties and breakdown in control. They also provide information
as to the probable cause of the change. Fig. 4 illustrates how persistent trends in product test
values can occur without attracting attention until too late to prevent an undesirable change in
quality. With the control charts, these trends become immediately apparent and a breakdown in
control can thus be averted.



Fig. 4.

Control chart for silicon in Meehanite 'GD' (Meehanite Standards), plotted against sample number.
Showing type of trend experienced in test values, difficult to detect without control chart.

The relation between a chart based on past performance of an individual foundry and the 
Standard chart is illustrated by Figs. 5 and 6 respectively. The examples show that a foundry may 
maintain its product in a state of control, based on its own past performance, and yet show a com- 
plete breakdown where the state of control is based on a standard performance, i.e., a process 
may be controlled at an unsatisfactory level. The examples show a lack of conformity in regard to
Average values. A similar lack of conformity may be found in Range values, or in both Range and
Average.
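The distinction between the two kinds of chart can be illustrated numerically. In the sketch below all figures are hypothetical: six group Averages for tensile strength, a Standard value of 19.0 tons/sq. in., and a Standard Deviation of 1.00 ton/sq. in. as in Table I. The limits are taken, by assumption, at 3.09 standard errors for groups of six; the paper does not give its formula.

```python
import math

def within_limits(averages, centre, sd, n=6, z=3.09):
    """Flag each group Average as inside (True) or outside the limits."""
    half_width = z * sd / math.sqrt(n)
    return [abs(a - centre) <= half_width for a in averages]

# Hypothetical group Averages, tons/sq. in.
group_averages = [17.4, 17.8, 17.2, 17.6, 17.9, 17.5]

# Chart based on past performance: centre line at the foundry's own mean.
own_centre = sum(group_averages) / len(group_averages)
own_flags = within_limits(group_averages, own_centre, sd=1.00)

# Standard chart: centre line at the (hypothetical) Standard value.
standard_flags = within_limits(group_averages, 19.0, sd=1.00)

if __name__ == "__main__":
    print("own chart, all in control:     ", all(own_flags))
    print("standard chart, all in control:", all(standard_flags))
```

With these figures every group lies inside the limits of the foundry's own chart, yet most fall below the lower limit of the Standard chart: the process is controlled, but at an unsatisfactory level.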

The principal benefits from the system are associated with the early correction that can be 
applied to check off-standard tendencies. However, it has also proved useful in investigations as to 
causes of trouble. An example of this is seen in the case of a foundry troubled with high Range 
values in Total Carbon content. In this case the Averages were reasonably satisfactory and difficulty 
was experienced in tracing the source of the high day to day variations. By re-grouping the in- 
dividual results, according to the period at which they were melted in the heat and plotting the 
groups so obtained on separate control charts (shown in Fig. 7) it became evident at once where the 
trouble lay. The average values for the latter portion of the heat were found to be consistently 
off-standard and the remedy was immediately apparent. 

In general, the higher standard of control obtained since the introduction of the system is
practical proof of the value of this control system. Its possibilities have not yet been exhausted,
and further benefits will undoubtedly result from greater experience in its operation. For example,
accurate chart interpretation is as yet far from full development, and it is in this direction that
progress in the immediate future will lie. Special attention is being given to interpretation technique,
and some notes on the progress made will be given.



Fig. 5.

Control chart for tensile on Meehanite 'GC', based on past performance, plotted against sample number.
Showing results on chart based on foundry past performance (compare with Fig. 6).

Chart Interpretation 

In all control chart work a point on which emphasis has been laid has been the importance of 
relating cause and effect. The charts themselves are only a means to the end — that of correcting 
faulty operating practice causing product variation. They can serve this purpose only if they can 
be interpreted correctly and the necessary corrective action applied in practice. Hence the technique 
of chart interpretation must be developed and mastered. 

A glance at chart results gives a general indication of the state of control, and this is useful in
assessing the control position in the foundry. But for the real work of maintaining a state of
control, averting an impending change or correcting a change that has taken place, a close study of the
chart points is essential. Only in this way can the hidden clues as to operating causes be extracted.
It is, in fact, surprising how much information of a technical nature can be gained merely by
deductions drawn from control chart clues.

(a) A Guide to Chart Interpretation. In order to assist Meehanite foundries in this work, a
Guide to Chart Interpretation has been drawn up. This is based on experience gained over some
two years' working of the system. It is by no means complete, but it does tie up cause and effect
in such a way that it is now possible from the charts to establish the technical factors responsible
for a change, in most cases without even going into the foundry. To appreciate this point it must
be realized that a change in any one of the properties of the product may be caused by one or more
of a large number of operating factors.

Fig. 6.

Control chart for tensile on Meehanite 'GC', Meehanite Standards, plotted against sample number.
Same results as in Fig. 5 but on Meehanite Standards Chart (note also comparison between Standard
Deviation and Range).

The procedure followed in tracking down the operating factors responsible for a change is
mainly one of elimination. It resolves itself into putting the following five questions:

1. Does the change occur on all mixes (that is, types of Meehanite) or on only one mix?

2. Does the change occur on all properties of the mix or mixes affected, or on only one
property?

3. Are Average values or Range values, or both, affected?

4. Is the change a gradual one or is it an abrupt fluctuation?

5. What degree of correlation exists between chemical and physical properties?





Fig. 7.

Control chart for Range of Total Carbon in Meehanite (Meehanite Standards), plotted against sample number.
Showing Range values before and after regrouping results.


To understand the procedure, it must be realized that each foundry operates at least two control 
sheets, each of which gives results for Average and Range of three properties of a mix, or type of 
Meehanite. Consequently, the results for each property of one mix can be compared with the
corresponding results on another mix, and advantage is taken of this fact to verify suspected causes
of trouble. It will be possible here to give only a bare outline of the steps involved in locating the 
cause of trouble. The questions will be examined briefly in turn. 

1. If the change is confined to one mix, then factors specific to that mix are examined;
for example, mix composition, raw materials confined to that mix, position in the heat, etc.
If the change is common to all mixes, then attention is concentrated on factors common to all
mixes.

2. If the change occurs on all properties of a mix, then certain operations affecting all these 
properties are examined. This considerably restricts the number of possible factors. If the 
change is confined to one property, then the factors affecting that property are examined in the 
light of the answer to the first question. The procedure to be followed in the case of each
property is detailed in the Guide. Usually at this stage sufficient information has been obtained 
to restrict the possible causes to one or two. 

3. Three cases are involved in the third question, and each is examined in detail in the 
Guide, giving the causes indicated and the remedies to be applied for each contingency. 
For example, in the first case Average values are in control, but Range values fall outside the 
limits. Here the trouble is traced to variation in test values tending to balance one another, 
due to periodicity of operating variables. Factors of a periodic nature are listed for examina- 
tion, such as, variations due to the use of different cupolas on alternate days, change in 
operators as in shift work, melting carried out at different times, e.g., beginning and end of
the heat, for the same mix, etc. Similarly, causes and corrective treatment are given in the
Guide for the other two cases. 

4. The fourth question, that of the nature of the change (gradual or abrupt), is significant
technically in that certain operating factors tend to produce a slow but cumulative effect, 
while others are immediate in their influence. For example, a change in coke quality tends to 
produce a gradually increasing (or decreasing) Total Carbon content in the metal extending 
over a long period. In such a case reference to the answers to questions 1 and 2 would confirm 
if this could be the factor responsible. 

5. The question regarding correlation between chemical and physical properties is aimed 
at checking the possibility of errors of measurement. The term is used to include sampling 
procedure, accuracy of chemical and physical testing, and generally any factor which may 
result in incorrect test data and lead to false conclusions. For example, if Total Carbon values 
fall, with Silicon remaining constant, lack of a related rise in Tensile would lead to a check 
on testing practice. Similarly, chemical properties in control, with Tensile out of control 
would indicate definitely that an abnormal condition existed in testing procedure and lead to 
an investigation on this point. 

In a further section of the Guide a list is made of the main divisions of the melting process, 
called General Variables; that is, factors which are fundamental to the process and on which control 
and supervision are concentrated. Against each General Variable are listed all the operations 
which are involved in that particular basic factor. By this means, the actual operation responsible 
for trouble may be picked out of a large number of possible causes. 

(b) Operating Changes Chart. This is an addition to the main control chart, on which all 
changes of an operating nature are recorded at the time they are made. It is placed below the 
main chart, with the group number and dates coinciding in a vertical line. For example, if a change 
is made in coke grading, it is recorded for reference at the date and group number in which the new 
coke is put into use. The effect of the change, if any, can then be traced on the main chart and, if 
necessary, correction made to hold the product to standard values. This has been found to reduce 
the work of searching for causes and avoids dependence on memory. 

(c) Rating of Standard of Metallurgical Control. One of the incidental advantages of this 
system of control is that it permits of applying a numerical rated value to the standard of control in 
each foundry. This is extremely useful, not only in assessing the progress made from one period to 
another for the same foundry, but also in comparing the standard of work from one foundry to 
another. 

The assessment is based on two factors : — 

1. The closeness of approach of the metal test values to Meehanite Standard values. 

2. The degree of consistency in these test values. 

For the first factor, the deviation of each of the group Averages, for the period under consideration, 
from the standard Meehanite value for the property is calculated. The mean of these 
deviations is found and related to the standard permissible deviation for the property. A percentage 
rating is then taken direct from a Rating Chart, on which zero deviation is given 100 per 
cent. rating and deviation equal to the standard value given 75 per cent. rating. Other values are 
proportionately rated. 

For the second factor, the mean of the group Ranges is taken for the period and, from a second 
Rating Chart, a percentage rating is obtained. This is based on 100 per cent. rating for zero mean 
Range and 75 per cent. rating for Range value equal to the standard value set for the property. 
Other values are proportionately rated. The final rating is the mean of the Group Deviation and 
Range ratings for the type or types of Meehanite made. 
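
The rating rule can be sketched in a few lines of code. This is only an illustration: it assumes that "proportionately rated" means a straight line through the two quoted points (100 per cent. at zero and 75 per cent. at the standard value), since the Rating Charts themselves are not reproduced here. The function names and the sample figures are hypothetical.

```python
def rating(value, standard):
    """Percentage rating, ASSUMING a straight line through (0, 100) and (standard, 75)."""
    if standard <= 0:
        raise ValueError("standard must be positive")
    # Clamping at zero is an assumption; the text does not say how very poor results are rated.
    return max(0.0, 100.0 - 25.0 * abs(value) / standard)


def final_rating(mean_deviation, std_deviation, mean_range, std_range):
    """Mean of the Group Deviation rating and the Range rating."""
    return 0.5 * (rating(mean_deviation, std_deviation) + rating(mean_range, std_range))


# Hypothetical figures: mean deviation is half the permissible deviation,
# and the mean Range equals the standard Range.
print(rating(0.5, 1.0))                   # 87.5
print(final_rating(0.5, 1.0, 2.0, 2.0))   # 81.25
```

If the actual charts are curved rather than straight, only the body of `rating` would need to change.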

Future Developments 

The statistical system of control described has so far been applied completely only to metal 
control in the foundry. It is hoped eventually to extend it to the final product, which is the casting 
itself; but owing to difficulties in measuring the properties of the casting this has not yet been 
accomplished. In the meantime, a start has been made by extending the application of the system 
to other operations involved in casting production. Of these, control of moulding sand properties 
is next in importance to metal control and the system has recently been put into operation for 
moulding sand control. The results obtained so far have shown excellent promise, particularly 
when applied to mechanical sand preparation. One of the great advantages of the system for 
moulding sand control is the early detection of trends away from standard, which has always been 
a source of trouble in sand control. Fig. 8 illustrates the type of chart used and some typical results 
obtained for sand compression strength. 

Fig. 8. 

Showing results for compression strength in moulding sand. 

It is hoped that this description of the application of Control Charts to cast iron production 
will encourage the use of this system of control for other metals and melting processes. 





Ultimate Risks in Sampling Inspection 

By Capt. A. H. R. Grimsey, R.A. (Military College of Science, Blurton, Stoke-on-Trent) 

[Paper discussed before the London Group of the Industrial Applications Section of The Royal 
Statistical Society, November 24th, 1944.] 

Whenever the suggestion is made that examination of a manufactured product should be by some 
scheme of inspecting only a proportion of the output, the first query which is made is “ What 
risk am I undertaking?” This is the question which is put equally by producer, inspector and 
consumer, and unless it can be answered satisfactorily, then the proposed system of inspection will 
not be accepted, in spite of the economies which it may produce. 

It is, of course, quite apparent that if some material is to be accepted or rejected unseen as a 
result of examination, then there are risks to be taken. The sample inspected may not be repre- 
sentative of the bulk ; this is an important point, and one which is often overlooked in practice. 
No deduction concerning the bulk, based on the examination of such a sample, can have much 
validity. But even when the sample is chosen in such a way that it is thought to be as closely as 
possible equal to any other which might have been chosen (this defining a “random sample”), 
there are inevitably variations from one sample to another. 

Thus from a batch of a certain quality, say 5 per cent. defective (i.e., 1 article in 20, on the average, 
being a defect), a random sample of 20 may contain 0, 1, 2, 3, ... defects. We may 
expect a sample to contain just 1 defect: this is the mathematical expectation, and it can be shown 
that more random samples will contain 1 defect than any other number of defects. Nevertheless, 
we may select a sample containing no defects at all. Such samples will not be rare occurrences, 
their relative frequency being calculable. Other samples will contain 2, 3, ... 20 defectives. 
The random sample containing 20 defects will certainly arise very rarely; the chance of it happening 
can be shown to be 1 in $20^{20}$. On the other hand, if such a sample is not randomly selected from 
the bulk, then it might be quite a common occurrence. 

The relative frequencies with which the various possibilities occur are, in fact, given by the 
following table : — 


No. of defectives     Proportion of samples                        Percentage of samples 

0 defect              $19^{20} / 20^{20}$                          35.85 

1                     $20 \cdot 19^{19} / 20^{20}$                 37.73 

2 or more defects     $(20^{20} - 39 \cdot 19^{19}) / 20^{20}$     26.42 

Total                                                              100.00 


Figures in the final column are the percentage probabilities which can be interpreted as either 
the relative frequency of occurrence over a long period or the chance (measure of confidence) of 
that particular result occurring in one sample. 
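
The proportions in the table can be checked numerically. The sketch below assumes, as the tabulated fractions imply, that each article in the sample is independently defective with chance 1 in 20 (the binomial model); the function and variable names are illustrative only.

```python
from math import comb


def binomial(k, n, p):
    """Chance of exactly k defectives in a random sample of n when a fraction p is defective."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.05                       # sample of 20 from material 5 per cent. defective
p0 = binomial(0, n, p)                # 19^20 / 20^20
p1 = binomial(1, n, p)                # 20 * 19^19 / 20^20
p2_or_more = 1.0 - p0 - p1            # (20^20 - 39 * 19^19) / 20^20
p_all_defective = binomial(n, n, p)   # 1 in 20^20

for label, prob in [("0", p0), ("1", p1), ("2 or more", p2_or_more)]:
    print(f"{label:<10} {100 * prob:6.2f} per cent.")
print(f"all 20 defective: {p_all_defective:.3e}")
```

The printed percentages should agree with the final column of the table to within a unit in the last figure.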

This has been examined at length to show that a sample containing fewer defects than the 
average will occur quite often, and so will samples containing more. If the sampling scheme is 
based upon such an idea as “accept only batches which produce samples containing not more 
than 1 defect,” then 26.42 per cent. of batches of quality 5 per cent. defective will be rejected on 
the average of a long run. Moreover, such rejection will be almost automatic over a long period, 
and there is no difference between the quality of those accepted or rejected according to this test, 
since all batches have been assumed to contain 5 per cent. defects. 

Therefore, if 5 per cent. is a permissible proportion defective, the producer of material of this 
quality stands the risk of having over a quarter of his output unjustifiably returned to him. On 
the other hand, if 5 per cent. defects be not permissible, then such a scheme would lead to the wrongful 
acceptance of about three-quarters of the batches offered, and this is clearly a grave risk to the 
consumer. 




The magnitude of such risks to both parties to any sampling scheme can be similarly evaluated 
for any assumed percentage of quality in the batches to be supplied. If a batch is condemned or 
approved as a result of one sample, this is known as a Single Sampling scheme. The following 
results apply to the scheme: “Take a sample of three and test; reject if one or more fails, accept 
if all pass,” when applied to batches of 100 articles. 

% defective in batch    0      10     20     30     40     50     60     70     80     90     100 
Chance of acceptance    1.00   0.726  0.508  0.338  0.212  0.121  0.061  0.025  0.007  0.001  0.000 

These results are illustrated in Fig. 1. 

Fig. 1. 

Chance of acceptance against Percentage Defective. 
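
The tabulated chances appear to correspond to drawing the sample of three without replacement from a batch of 100, so that the number of defectives in the sample follows the hypergeometric law. A minimal sketch under that assumption (the function name is illustrative):

```python
from math import comb


def accept_single(batch, defectives, sample=3):
    """Chance that a random sample, drawn without replacement, contains no defectives."""
    return comb(batch - defectives, sample) / comb(batch, sample)


for d in range(0, 101, 10):
    print(f"{d:3d} per cent. defective: chance of acceptance {accept_single(100, d):.3f}")
```

With a batch of 100 the number of defectives equals the percentage defective, which is why `d` serves for both in the loop.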


It may be desired to give the producer a greater margin by the modification of the scheme to 
read: “Accept if all are sound out of a sample of three, reject if two or more are defective; if 
one is defective, take a further sample of five and accept if no defects then occur.” 

This scheme, with its first not-proven category and subsequent re-test, is typical of many Double 
Sampling schemes. Multiple or Sequential schemes are obtained by extending this idea to further 
re-testing. 

If the quality of the batch is known (i.e., the “ percentage defective”), then the chances of 
acceptance, rejection on first sample, and the necessity for re-test can all be calculated and plotted. 
(See Figs. 2a and 2b.) 



Fig. 2a. 

Chance of Re-test based on 1 Defect in Sample. Horizontal axis: Percentage Defective. 

Fig. 2b. 

Chance of Rejection based on 2 and 3 Defects in Sample of 3. 





These results for the first sample are most conveniently shown together in one diagram (Fig. 3). 
In this Figure the ordinate at any percentage defective is unity, but is divided into three parts 
corresponding to the chance or relative frequency of accepting, re-testing, and rejecting repeated 
batches of each percentage defective. These are denoted: 

L: Accept on 1st test. 

M: Re-test on 1st test. 

N: Reject on 1st test. 
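
The three-way division of the ordinate can be computed directly. The sketch below assumes the same conditions as the single-sampling example (a sample of three drawn without replacement from a batch of 100); the function name is illustrative.

```python
from math import comb


def first_test_split(batch, defectives, sample=3):
    """Return (L, M, N): chances of accepting, re-testing and rejecting on the first test."""
    total = comb(batch, sample)
    good = batch - defectives
    L = comb(good, sample) / total                    # no defects in the sample: accept
    M = defectives * comb(good, sample - 1) / total   # exactly one defect: re-test
    N = 1.0 - L - M                                   # two or more defects: reject
    return L, M, N


for d in (10, 30, 50):
    L, M, N = first_test_split(100, d)
    print(f"{d:3d} per cent. defective: L={L:.3f}  M={M:.3f}  N={N:.3f}")
```

The three parts always add to unity, which is the property the diagram relies on.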



Fig. 3. 

Illustrating Result of First Test in a Double Sampling Scheme. Horizontal axis: Percentage Defective. 

Thus, considering the ordinate represented by the broken line in Fig. 3, the portion in the area 
marked “L” is equal to the ordinate in Fig. 1; that portion in area marked “M” is equal to 
the ordinate on Fig. 2a; and that portion in area “N” is equal to the ordinate of Fig. 2b. 



Fig. 4. 

Illustrating Result of a Double Sampling Scheme. Horizontal axis: Percentage Defective. 

Sampling Scheme. 
First Sample 3: Accept if 0 defects. Re-test sample of 5 if 1 defect. Reject if 2 or more defects. 
Second Sample 5: Accept only if 0 defects. 

The examination of the second sample (of five) will result in the division of the re-test region into 
two : the whole figure then appears with one curve dividing the regions of acceptance and rejection 
(Fig. 4). 
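
The dividing curve for the full scheme can be sketched as follows. Two assumptions are made that the text does not state: the batch contains 100 articles, as in the single-sampling example, and the second sample of five is drawn from the 97 articles left after the first sample has been removed. All names are illustrative.

```python
from math import comb


def hyper(k, population, defectives, sample):
    """Chance of exactly k defectives in a sample drawn without replacement."""
    return (comb(defectives, k) * comb(population - defectives, sample - k)
            / comb(population, sample))


def accept_double(batch, defectives, n1=3, n2=5):
    """Chance of acceptance: 0 defects in n1, or 1 defect in n1 followed by 0 defects in n2."""
    accept_first = hyper(0, batch, defectives, n1)
    retest = hyper(1, batch, defectives, n1)
    if retest == 0.0:
        return accept_first
    # ASSUMPTION: the first sample (one defective, n1 - 1 sound) is not returned to the batch.
    accept_second = hyper(0, batch - n1, defectives - 1, n2)
    return accept_first + retest * accept_second


for d in range(0, 101, 10):
    print(f"{d:3d} per cent. defective: chance of acceptance {accept_double(100, d):.3f}")
```

Comparing this output with the single-sampling table shows how much extra margin the re-test gives the producer at each quality level.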







This curve has been called* the “Operating Characteristic of the Full Sampling Scheme”. 
The whole diagram details fully the results of the application of the scheme to a series of batches 
all at the same level of quality. To avoid confusion, this diagram will be called in this paper the 
Sampling Diagram and the dividing full-line curve the “Sampling Characteristic” of the scheme. 
The Sampling Characteristic of any scheme can be calculated and drawn. 

It is usual for the consumer to lay down specification limits or tolerances and to 
institute a system of inspection to confirm that these are being kept. If the article be a really 
critical one, it may be important that no defective article be accepted. In such a case the consumer 
will probably call for 100 per cent. inspection; however, this may be impossible because the test 
is a destructive one, e.g., functioning of fuses, ultimate tensile strength, life of lamp filaments, 
etc. Some scheme of rational sampling is then essential. In less critical cases a sound scheme of 
selective inspection may be quite sufficient, with advantages over the more costly complete 
examination. 



In examples where the highest level of quality is not essential, it is often agreed that it is preferable 
to accept a large quantity of fairly good material rather than a smaller amount of perfect material. 
This is a common state of affairs, and represents a compromise between the consumers’ requirement 
for quantity and the manufacturers’ difficulties with precision work. But such a compromise must 
incorporate some control of quality, and it is usual for the consumer to set some limit to the percentage 
defective. This has been called “lot tolerance percentage defective” by Dodge-Romig,† 
and more simply “maximum allowable” by Swan.‡ For example, the value of this maximum 
allowable (m.a.) percentage may be 1 per cent., and the consumer will then be satisfied if not more 
than 1 per cent. of a batch of articles is defective. 

To attain this, it is clear that in order to avoid frequent rejection the producer must work to an 
average quality level, a “process average”† (p.a.) less than the m.a. Later consideration will 
show that the ratio p.a./m.a. is of considerable importance in any sampling system, and should be 
as low as possible — a desirable but insufficient condition to the manufacturer. 

The Producer’s Risk is usually associated with the process average and the Consumer’s Risk 
is associated with the “ maximum allowable.” This is true only under idealized conditions, which 
do not arise in practice. It is shown below that the former is a useful approximation, but that the 
latter may be misleading. 

If the proportion defective in any batch of a manufactured article is exactly equal to the “ process 
average,” then the Sampling Characteristic shows the chance of rejection of such a batch. This 
chance or probability is the proportion of batches of this quality which would be rejected under 
the scheme. Further, this chance is a measure of, and can be called, the Producer’s Risk, π, since 
all these batches, whether accepted or rejected, are of the same quality, and satisfactory to the 
consumer if fuller inspection were carried out. (Fig. 5.) 

* “Sampling Inspection,” by H. Rissik, Aircraft Engineering, May 1943. 

† “... Tables,” Bell Telephone Technical Journal, January 1941. 

‡ “... Qualitative Inspection,” Institution of Mechanical Engineers, December 17th, 1943. 





Similarly, if the proportion defective in any batch of a manufactured article is just greater than 
the “maximum allowable,” then the Sampling Characteristic shows the chance of accepting 
such a batch. This chance or probability is the proportion of batches of this quality which would 
be accepted under the scheme, and clearly is a measure of, and can be called, the Consumer’s 
Risk, y. 

It is important to note that the Producer’s Risk is π only if all batches are of p.a. quality, and 
the Consumer’s Risk is y only if all batches submitted are just below m.a. quality. Now, these are 
unlikely conditions in practice. No process is likely to produce ten defects in every thousand. 
If the process average, determined over a long period, is indeed 1 per cent., then some batches of 
1,000 will contain 10 defects, but others will contain 9, 8, 12, 7, etc., defects. 

In other words, from batch to batch there will always be an inevitable swing about the process- 
average value. This is inherent in any process, and requires greater investigation than it has 
hitherto received. 

The nature of this variation in quality from batch to batch about an average quality has an 
important influence on the choice of a sampling scheme. 

Most concerns will have sufficient data recorded to be able to investigate batch-to-batch quality 
variations. Thus, if batches consist of 1,000 articles, then past records will show the number of 
defective articles found in each batch. A histogram can then be constructed by compiling the results 
in the well-known “ cricket-score ” method. (See Fig. 6.) 

Thus a run of 5, 6, 3, 6, 7, 4, 9, 6, 6, 7, 3, 6, 5, 9, 7, 6, 10, 6, 5, 8, 7, 8, 5, 9, 6, 8, 7 defectives 
in successive batches gives the following: 


Defectives in batch    3    4    5    6    7    8    9    10 
No. of batches         2    1    4    8    5    4    3    1 

Totals: Batches 28; Defectives 182. 

∴ Estimated process average (p.a.) = 182/28,000 = 0.65%. 
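
The tally and the process average can be reproduced in a few lines. The counts used below are an assumption: they are the frequencies consistent with the stated totals of 28 batches and 182 defectives, with batches of 1,000 articles as in the text. The variable names are illustrative.

```python
# Defectives in batch -> number of batches (ASSUMED counts, chosen to agree with the stated totals).
tally = {3: 2, 4: 1, 5: 4, 6: 8, 7: 5, 8: 4, 9: 3, 10: 1}
batch_size = 1000

batches = sum(tally.values())
defectives = sum(x * n for x, n in tally.items())
process_average = defectives / (batches * batch_size)

# Process Characteristic in tabulated form: fraction of batches at each number of defectives.
z = {x: n / batches for x, n in tally.items()}

for x in sorted(tally):
    print(f"{x:2d}  {'|' * tally[x]}")   # "cricket-score" tally marks
print(f"Batches {batches}, defectives {defectives}, p.a. = {100 * process_average:.2f}%")
```

The dictionary `z` is the histogram form of the Process Characteristic and can be used directly in later calculations without fitting a curve.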


Fig. 6. 

Process Characteristic. Horizontal axis: No. of Defects in a Batch. 


The greater the number of results available, the better the estimate of the variability in quality 
from batch to batch, provided that all the results used are associated with similar processing and, 
as far as possible, are representative of anticipated future production. If many results are available 
it may be possible to fit a curve to this histogram : such a curve has been called the Operating 
Characteristic, but a preferable name is the Process Characteristic. 

It should be remarked that the mathematics used for determining such a curve are not simple. 
The fitting of a curve, however, is not essential to the following argument. For the purpose of 
calculation, it is far simpler and better to treat results in the histogram or tabulated form. (See 
full stepped line in Fig. 7a.) The term Process Characteristic may be applied to the smooth curve, 
or the more fundamental histogram. Either is truly characteristic of the quality of production and 
gives the distribution of defectives in batches. 

The chance of accepting or rejecting any batch submitted to the Sampling Scheme will depend 
on the proportion defective, and is given by the ordinate of the Sampling Characteristic appropriate 
to that proportion. To find the total chance of acceptance for the production represented by the 
Process Characteristic we must therefore add together, for all possible values of the percentage 
defective, the combined probability of : — 

(a) a batch having a percentage defective x (= z corresponding to x, see Figs. 7a and 7b), 
and 

(b) of accepting such a batch whenever it is offered to the Sampling Scheme (= y corresponding 
to x, see Fig. 5). 




This is, in effect, a weighting of the Sampling Characteristic by the Process Characteristic. 

The probability associated with (b) is given by the ordinate y of the Sampling Characteristic at 
any value of x (Fig. 5), and the probability associated with (a) is given by the ordinate z of the 
Process Characteristic at the same value of x. (See Fig. 7a.) The chance of accepting a batch of 
quality x is therefore the product yz, since z is the fraction of batches submitted and y the proportion 
of these accepted. 



Fig. 7a. Fig. 7b. 

Illustrating the Determination of Producer's and Consumer's Risks, for given Process and Sampling 
Characteristics. Horizontal axes: Percentage Defective. 

The summation of yz for all values of x divided by the summation of z gives the total proportion 
of the production which will be accepted; the summation of (1 − y)z similarly gives the proportion 
which will be rejected, since (1 − y) is the chance of a batch of quality x being rejected. 
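
In symbols, writing y(x) and z(x) for the ordinates of the Sampling and Process Characteristics at percentage defective x (the functional notation is introduced here only for illustration), the weighting amounts to:

```latex
\[
\text{proportion accepted} = \frac{\sum_x y(x)\,z(x)}{\sum_x z(x)},
\qquad
\text{proportion rejected} = \frac{\sum_x \bigl(1 - y(x)\bigr)\,z(x)}{\sum_x z(x)} .
\]
```

The two proportions add to unity, and when z is already expressed as a fraction of all batches the denominator is 1.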

The value of y is usually obtained from a smooth curve. If the value of z be given as a table or 
as a histogram, the values of yz will plot in a stepped form (Fig. 7a). On the other hand, if the 
value of z be plotted as a smooth curve, the products yz will plot as a smooth curve. (See Fig. 7b.) 

In either case the area under the graph of yz represents the proportion of the production accepted, 
and the area between the graphs of yz and z represents the proportion rejected. 

If on these graphs the position of maximum allowable (M) is drawn as an ordinate, the whole 
output (represented by the total area under the z curve) is divided into four categories corresponding 
to the areas A, C, R and P, shown in Fig. 7b, for the case of the smooth curves, but more simply 
calculated from the known true values. 

That part of the Process Characteristic with x less than OM, namely, A + P, represents satisfactory 
product. Of this, A is correctly accepted by sampling and P wrongly rejected, because the 
broken curve is the boundary of acceptance and rejection. Thus, P/(A + P + C + R)* could be 
called the Producer’s Risk. It should be noted, however, that of the total production represented 
by the Process Characteristic, the area P + R represents product rejected. Hence the ratio 
(P + R)/(A + P + C + R) may be considered as the Ultimate Producer’s Risk for the production 
represented by the Process Characteristic. 

That part of the Process Characteristic with x greater than OM, namely, R + C, represents 
production which is unsatisfactory, and should be rejected. Nevertheless, C is accepted, and only 
R rejected; there is therefore a proportion C wrongly passing the Sampling Test. This is of vital 
interest to the consumer, and this proportion C/(A + P + C + R) is logically termed the true or 
Ultimate Consumer’s Risk (UCR) for the production represented by the Process Characteristic 
considered. [The consumer is accepting a risk of a different kind, viz., the rejection of some 
satisfactory material.] 
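
The four areas and the resulting risks can be worked out from tabulated values, as the following sketch suggests. Everything numerical in it is hypothetical: the Process Characteristic is a tally of defectives per batch of 1,000, the sampling scheme is "accept if a sample of 50 contains no defectives", and the maximum allowable is taken as 8 defectives per batch, with batches at or below that figure counted as satisfactory.

```python
from math import comb

BATCH, SAMPLE = 1000, 50     # hypothetical single-sampling scheme: accept on 0 defects
MAX_ALLOWABLE = 8            # hypothetical m.a., in defectives per batch


def y(x):
    """Sampling Characteristic: chance of accepting a batch containing x defectives."""
    return comb(BATCH - x, SAMPLE) / comb(BATCH, SAMPLE)


# Hypothetical Process Characteristic: defectives in batch -> number of batches.
tally = {3: 2, 4: 1, 5: 4, 6: 8, 7: 5, 8: 4, 9: 3, 10: 1}
total = sum(tally.values())

A = P = C = R = 0.0
for x, n in tally.items():
    z = n / total                      # fraction of batches of this quality
    if x <= MAX_ALLOWABLE:             # satisfactory product
        A += y(x) * z                  # correctly accepted
        P += (1 - y(x)) * z            # wrongly rejected
    else:                              # unsatisfactory product
        C += y(x) * z                  # wrongly accepted
        R += (1 - y(x)) * z            # correctly rejected

whole = A + P + C + R                  # equals 1 because z was normalised
print(f"Producer's Risk           P/(A+P+C+R)     = {P / whole:.3f}")
print(f"Ultimate Producer's Risk  (P+R)/(A+P+C+R) = {(P + R) / whole:.3f}")
print(f"Ultimate Consumer's Risk  C/(A+P+C+R)     = {C / whole:.3f}")
```

Changing `MAX_ALLOWABLE` to a value beyond the largest tabulated number of defectives makes C and R vanish, which is the situation described in the next paragraph.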

In practice the Process Characteristic (z curve) may not extend as far as x = OM, i.e., no 
product is made worse than maximum allowable. Under this condition there can be no risk to the 
consumer and there should be no rejections. On the contrary, from an examination of the Sampling 
Characteristic alone there will be values of the so-called Producer’s Risk (π) and Consumer’s 
Risk (y) (Fig. 5). The value π is somewhat in error, for it assumes no spread in quality from batch 
to batch about the process average, and the value y does not arise, for the simple reason that it is 
known that no batches are offered containing as many defectives as the maximum allowable. 

* If the diagrams are prepared so as to have a total area unity, then the risks are more simply 
referred to in terms of the areas P, R, etc. 

Investigation has shown that, in general, the effect of allowing for the inevitable variations in 
batch-to-batch quality is to change slightly the Producer’s Risk and to reduce considerably the 
Consumer’s Risk (which may in some cases be zero). This illustrates the importance of the value of 
the ratio p.a./m.a. referred to earlier in the paper. 

It is urged that the simplicity by which Process Characteristics or Data can be found from 
existing records should be utilized to investigate the characteristics of various processes. It is likely 
that certain mechanical processes will have their own types of characteristic, and there is need for 
investigation of this. 

The author has investigated the effect of Process Characteristics derived from the Poisson 
Distribution, which has some theoretical justification, but the main point emphasized here is that 
the principle is sound for any type of characteristic. The efforts of research workers could bene- 
ficially be directed to the determination of process characteristics in practical cases and to tracing 
the effects upon the whole question of economic Sampling Inspection schemes. 



INDEX 

TO THE Supplement to the Journal of the Royal Statistical Society 

VoL. VIII, 1946. 


PAGE 

Analysis of a Series of Experiments by the Use of Punched Cards. See Kempthorne (O.). 

Anscombe (F. J.). Linear Sequential Rectifying Inspection for Controlling Fraction Defective 216 

Application of Some Commercial Calculating Machines to Certain Statistical Calculations. 

See Hartley (H. O.). 

Armitage (P.). See Stockman (C. M.) and Armitage (P.). 

Average Sampling Numbers from Finite Lots. See Vajda (S.). 

Barnard (G. A.). Sequential Tests in Industrial Statistics 1-21 

Simple inspection problem — the Inspection diagram ........ 3 

General sequential tests ............ 8 

General inspection problems ............ 10 

Discussion: Mr. Womersley; Dr. Bartlett; Dr. Vajda ; Mr. Tweedie; Mr. A. E. Jones ; Dr. Yates ; 

Mr. Anscombe ; Mr. Bosanquet ; Mr. Barnard in reply 22-26 

Bartlett (M. S.). Modified Probit Technique for Small Probabilities 113 

On the Theoretical Specification and Sampling Properties of Auto-correlated Time- 

series 27 

and Kendall (D. G.). The Statistical Analysis of Variance-heterogeneity and the 

Logarithmic Transformation 128 

Bayley (G. V.) and Hammersley (J. M.). The “Effective” Number of Independent Observations 
in an Auto-correlated Time-series 184 

Burman (J. P.). Sequential Sampling Formulae for a Binomial Population .... 98 

Cunningham (L. B. C.) and Hynd (W. R. B.). Random Processes in Problems of Air Warfare 62 

Distribution of the Sum of n Sample Values. See Francis (V. J.). 

“Effective” Number of Independent Observations in an Auto-correlated Time-series. See 
Bayley (G. V.) and Hammersley (J. M.). 

Foster (G. A. R.). Some Instruments for the Analysis of Time-series and Their Application 

to Textile Research 42 

Francis (V. J.). On the Distribution of the Sum of n Sample Values Drawn from a Truncated 

Normal Population 223 

Grimsey (A. H. R.). Ultimate Risks in Sampling Inspection 244 

Hammersley (J. M.). See Bayley (G. V.) and Hammersley (J. M.). 

Harding (E. W.). Statistical Control Applied to High Duty Iron Production ... 233 

Hartley (H. O.). The Application of Some Commercial Calculating Machines to Certain 

Statistical Calculations 154-173 

Part I. Multi-variable work mechanized ......... 155 

Part II. Miscellaneous applications of calculating machines ...... 105 

Discussion: Dr. Wishart; Mr. Feiller; Mr. Mandeville; Mr. J. Todd; Dr. H. G. Hudson; 

Mr. Hey; Dr. Comrie; Mr. Boss; Mr. Seal; Mr. Ineson; Dr. Booth; Dr. Hartley in reply 173-183 

Hynd (W. R. B.). See Cunningham (L. B. C.) and Hynd (W. R. B.). 

Kempthorne (O.). Analysis of a Series of Experiments by the Use of Punched Cards . 118 

Kendall (D. G.). See Bartlett (M. S.) and Kendall (D. G.). 

Linear Sequential Rectifying Inspection. See Anscombe (F. J.). 

Modified Probit Technique for Small Probabilities. See Bartlett (M. S.). 





Random Processes in Problems of Air Warfare. See Cunningham (L. B. C.) and Hynd 
(W. R. B.). 

Richardson (J. T.). Table of Lagrangian Coefficients for Logarithmic Interpolation of 
Standard Statistical Tables to Obtain Other Probability Levels 212 

Sequential Sampling Formulae for a Binomial Population. See Burman (J. P.). 

Sequential Tests in Industrial Statistics. See Barnard (G. A.). 

Some Instruments for the Analysis of Time Series. See Foster (G. A. R.). 

Some Properties of Closed Sequential Schemes. See Stockman (C. M.) and Armitage (P.). 

Statistical Analysis of Variance-heterogeneity and the Logarithmic Transformation. See 
Bartlett (M. S.) and Kendall (D. G.). 

Statistical Control Applied to High Duty Iron Production. See Harding (E. W.). 

Statistical Methods in the Selection of Army and Navy Personnel. See Vernon (P. E.). 

Stockman (C. M.) and Armitage (P.). Some Properties of Closed Sequential Schemes 104 

Symposium on Auto-correlation in Time Series 27-85 

Discussion: Mr. M. G. Kendall; Dr. H. E. Daniels; Dr. H. Jeffreys; Mr. J. E. Moyal; Major 
Hammersley; Dr. Hartley; Mr. Stone; F/Lt. Buckland; Dr. Spencer-Smith; Dr. Bartlett, 
Mr. Foster, Dr. Cunningham and Mr. Hynd in reply 85-97 

Table of Lagrangian Coefficients for Logarithmic Interpolation. See Richardson (J. T.). 

Theoretical Specification and Sampling Properties of Auto-correlated Time-series. See 
Bartlett (M. S.). 

Ultimate Risks in Sampling Inspection. See Grimsey (A. H. R.). 

Use of the Negative Binomial Distribution in an Industrial Sampling Problem. See Wise (M. E.). 


Vajda (S.). Average Sampling Numbers from Finite Lots 198 

Vernon (P. E.). Statistical Methods in the Selection of Army and Navy Personnel 139-148 

Types of data ............ 140 

Correlational techniques ........... 142 

Factor analysis ............ 144 

Analysis of variance ........ 14« 

Discussion: Prof. Burt; Dr. Fraser Roberts; Mr. Slater; Dr. Irwin; Sergeant Kinsman; Mr. 
Kendall; Mr. Bains; Dr. Vernon in reply 148-153 

Wise (M. E.). Use of the Negative Binomial Distribution in an Industrial Sampling Problem 202 



SUPPLEMENT TO THE JOURNAL 


OF THE 

ROYAL 

STATISTICAL SOCIETY 


Founded 1834 

Incorporated by Royal Charter 1887 


Vol. IX. Nos. 1-2, 1947 


LONDON : 

ROYAL STATISTICAL SOCIETY 
4, PORTUGAL STREET, W.C.2 
(1947) 



CONTENTS 


VOL. IX. NO. I, 1947 

PAGES 

On the Interdependence of Blocks of Transactions. By Richard Stone 1-32 

Discussion on the Paper 32-45 

The Principles of Biological Assay. By D. J. Finney 46-81 

Discussion on the Paper 81-91 

The Random Division of an Interval. By P. A. P. Moran 92-98 

The Significance of Associations in a Square Point Lattice. By D. J. Finney 99-103 

The Oscillatory Properties of the Moving Average. By J. L. Spencer-Smith 104-113 

The Factor Analysis of a Matrix of 2 × 2 Tables. By Patrick Slater 114-127 

Factorial Experiments Derivable from Combinatorial Arrangements of Arrays. By C. Radhakrishna 
Rao 128-139 

Exhibition of Mechanical Aids to Statistical Computation 140 


VOL. IX. NO. II, 1947 

PAGES 

Statistical Investigation of Casualties Suffered by Certain Types of Vessels. By S. Vajda 141-163 

Discussion on the Paper 164-175 

Multivariate Analysis. By M. S. Bartlett 176-190 

Discussion on the Paper 190-197 

Methods of Deferred Sentencing in Testing the Fraction Defective of a Continuous Output. 
By F. J. Anscombe, H. J. Godwin and R. L. Plackett 198-217 

Regression Lines and the Linear Functional Relationship. By D. V. Lindley 218-244 

Grouping Corrections for High Autocorrelations. By H. E. Daniels 245-249 

Some Sequential Tests of Student's Hypothesis. By P. Armitage 250-263 

Index to Vol. IX (1947) 264 



SUPPLEMENT TO THE 

Journal of the Royal Statistical Society 

Vol. IX, No. 1. 1947 

On the Interdependence of Blocks of Transactions* 

By Richard Stone 

[Read before the Research Section of the Royal Statistical Society, December 5, 1946, 

Dr. J. Wishart in the Chair] 

1. Introduction 

It is a common experience of investigators in applied economics that the amount of information 
available to test any particular theory is limited, and does not provide the variety of experience 
necessary for deciding a point at issue. For historical reasons data of the required degree of com- 
plexity and reliability are frequently available only for a comparatively short period, though 
much energy is nowadays devoted to extending the scope of such data. In a short period, say 
twenty years, an economic system may not perform crucial experiments of the type needed for 
testing particular hypotheses; the similarity of the movements exhibited by different parts of the 
economy may greatly restrict the conclusions which may validly be drawn from a given body of 
data. This state of affairs is manifested in the fact that the variation of a particular variable, 
say the quantity of a given commodity consumed, can often be explained with only a few of the 
predictors that would seem to be necessary on theoretical grounds. These predictors could, 
conceptually, have considerable independent fluctuations, but in fact within the available short 
experience they do not; to a close degree of approximation, unexpectedly simple relationships 
subsist between them. It will be useful to have a technique for uncovering these short-run 
relationships, whether they are the expression of general economic or statistical laws, of
explainable short-run tendencies, or of anything else.

This particular small sample difficulty is nowhere more important than in the field of macro- 
economics, that is to say that part of economics which seeks to explain the variations in total 
activity as opposed to variations in the behaviour of a single entity. In such investigations 
the variables in our systems of stochastic simultaneous equations consist of large blocks of trans- 
actions, such as the national income, consumers’ expenditure, capital formation and the like, 
and other average or aggregate series, such as the level of prices, the rate of interest and the stock
of capital equipment of all kinds. Some further subdivision may be adopted; for example, 
income may be subdivided by type of income payment or capital formation may be broken down 
into producers’ durable equipment, building and the net change in inventories, but even so, the 
variables are aggregates or averages in which much of the individuality of the component series 
is lost. 

Instead of simply noting the interdependence of this kind of variable we may select a set of such 
variables for analysis, and see how far it is possible to reconstruct their movement from a small 
number of common factors. These factors are not themselves observed, but are constructed with 
a view to explaining the observed variables. If we have $n$ variables and $m$ factors and assume
linear relationships, this hypothesis is expressed in a system of $n$ equations of the form

$$x_j = \sum_s a_{js} F_s \qquad (1)$$

where the $x_j$, $j = 1, \ldots, n$, are the observed variables and the $F_s$, $s = 1, \ldots, m$, are
the hypothetical factors. If there is a high degree of intercorrelation in the movement of the
variables, we may expect to be able to explain most of the variance of the $x_j$ in terms of a number
of factors which is small compared with the number of variables.

This position may be expressed differently as follows: Suppose $m = 1$; then all the $x_j$ would
move in exactly the same way and could be classified by one criterion of classification, the
amplitude of their movement. If the $x_j$ required two (independent) factors for their explanation we
could not order them by one criterion of classification; we should need two independent criteria.
In this case the $x_j$ might move very differently, since in any given $x_j$ the weights of the two
factors (the $a_{js}$) might be very different.

* I should like to record my indebtedness to my friend, H. R. Fisher, who was kind enough to read
this paper in draft and suggest a number of improvements. In particular he suggested the construction
of a spherical correlation map, which seems to me to bring out so clearly the essential features of factor
analysis.

The statistical technique of factor analysis is designed to deal with precisely this problem. This 
method has been developed and applied largely by psychologists for analysing the structure of 
abilities reflected in batteries of mental tests, and a number of different procedures have been put 
forward and applied by different investigators. These differences are largely attributable to 
alternative views of the kind of structure of hypothetical mental abilities that different investi-
gators consider to be reasonable on theoretical grounds. 

The method used here is due to Hotelling,* and has the advantages first that the hypothetical
variables or factors in terms of which the observed variables are explained are orthogonal, and
second that each factor accounts for as much of the remaining total variance of the observed
variables as possible. In principle $n$ variables observed on a large number of occasions can be
explained completely (in the absence of exact linear interdependence between them) in terms of
just $n$ factors, but where the intercorrelations of the variables are high most of this combined
variance can be explained by a number of factors which is small compared with $n$.

In geometrical terms the problem may be set out briefly as follows: our data consist of
measurements in the form of variations from means of $n$ transaction blocks for each of $N$ years,
which the reader accustomed to psychological applications may think of as tests and persons
respectively. We set up orthogonal axes along which the transaction blocks are measured and in the
resulting $n$-space each year may be represented by a point. Let us take the case where $n = 3$
and consider the configuration of the $N$ points in the three-dimensional transaction space. If we
express the deviations in standard form, then with independent transaction blocks each
distributed according to the normal law, the points representing the $N$ observations will tend
as $N$ increases to take the form of a spherically symmetrical swarm with density highest at the
intersection of the axes, i.e. at the mean values of the transaction blocks, and falling off evenly in
all directions from the centre. If the transaction blocks are correlated the spherical distribution
will be replaced by an ellipsoidal one. And, to take an extreme case, if the variables could be
represented exactly by a single factor $F_1$, the swarm would be confined to a straight line, for the
co-ordinates of a point of the swarm, being $a_{j1}F_1$, would always be in the proportions
$a_{11} : a_{21} : a_{31}$. Along the radius vector defined by these proportions the $N$ points would be
spaced according to the corresponding values of $F_1$. If now most, but not all, of the variation of
the observations is explainable by a single factor, the swarm is elongated but not purely linear; and
the projections of the $N$ points on to a line forming the core of the swarm constitute the first factor
representations of the data. In Hotelling’s method this core is chosen so as to minimize the sum
of squares of residues, and the choice is found to fall on the first principal axis of a certain ellipsoid.
Let now the variation due to this factor be removed. Points corresponding to the residues form
the two-dimensional swarm got by collapsing the three-dimensional swarm in the direction of the
first principal axis. Variability within the residues may be chiefly describable by distance along
the core of this two-dimensional swarm, which, located by a similar least-squares method, is
the second principal axis of the original ellipsoid. If the two-dimensional swarm is collapsed
parallel to this axis there is left the remaining variability concentrated along the smallest principal
axis of the ellipsoid; this corresponds to a third factor which exhausts the data. Corresponding
to the orthogonality of the axes there are certain algebraical relations of orthogonality between
the coefficients $a_{js}$; also the product moment of the swarm referred to any pair of the principal
axes is zero (the principal factor axes being in fact the same as the principal axes of inertia of the
swarm), and so the factor values, being proportional to the co-ordinates of the $N$ points with
respect to the principal axes, are uncorrelated, or orthogonal.

This geometrical representation is merely an extension of the familiar correlation scatter
diagram. While convenient in introducing the subject, it has not the analytic power of the
obverse representation, more prominently used in works on factor analysis, [in which there will]
be one point for each of the $n$ transaction blocks subsisting in a space of $N$ [dimensions]. If we
consider only normalized variables, i.e. those with sum of squares equal to unity, it can be seen that
each variable will lie on the surface of an $N$-sphere, and that each point on this surface will
correspond to a possible mode of variation over the period of $N$ time units. If the observed variations
of the $n$ variables can be represented approximately in terms of $m$ factors, it will be possible to
find an $m$-sphere the surface of which approximately contains the $n$ points. If, as in the present
case, the set of variables is nearly describable by three factors, it is possible to represent them on
the surface of a sphere. This representation (a model of which will be available for inspection
when this paper is read) shows conveniently the intercorrelations of the variables (or rather their
three-factor representations), the affinities of these representations with the three principal or
other factors and the possible good regression equations between them.

* “Analysis of a Complex of Statistical Variables into Principal Components,” by H. Hotelling
in Journal of Educational Psychology, 1933, pp. 417-41 and 498-520. Descriptions of this method can be
found in Factor Analysis (1941) by Holzinger and Harman, especially in Chapter VII and Appendix D,
and in The Factorial Analysis of Human Ability (1939) by G. H. Thomson, especially in Chapter V and
Note 7 of the mathematical appendix.

[Diagram 1: Employees’ Compensation]

[Diagram 2: Consumers’ Perishable Goods plus Producers’ Durable Goods]

The sphere is imagined embedded in $N$-space, and to each of the $n$ variables there corresponds
a point on the surface of the sphere indicating the three-factor representation of the variable.
These points are the radial projections of the projections into 3-space of the true points in $N$-space.
The distances between the projections of each variable on to the unit $N$-sphere and those on to the
unit 3-sphere are proportional to $\sqrt{2(1 - R_{j\cdot 123})}$, and are represented by circles with this
radius drawn round the points representing the variables on the 3-sphere.

[Diagram 3: Unadjusted Net Savings of Enterprises plus Adjustment for Capital Revaluation]
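
The chord length quoted for the circles can be checked on a toy case. The sketch below is only an illustration under stated assumptions: the vector `v` is a made-up normalized variable in 4-space rather than 17-space, the "factor space" is taken to be the first three co-ordinates, and `R` stands for the multiple correlation of the variable with the three factors (the length of its projection). None of these numbers comes from the paper.

```python
import math


def chord(R):
    """Distance between a unit vector and the radial projection of its
    projection, when the projection has length R."""
    return math.sqrt(2.0 * (1.0 - R))


# Hypothetical normalized variable in 4-space (sum of squares = 1).
v = [0.6, 0.0, 0.0, 0.8]

# Project into the assumed 3-dimensional factor space (first three axes).
proj = v[:3] + [0.0]
R = math.sqrt(sum(c * c for c in proj))

# Radial projection: stretch the projection back on to the unit sphere.
radial = [c / R for c in proj]

distance = math.sqrt(sum((p - q) ** 2 for p, q in zip(v, radial)))
```

Here `distance` and `chord(R)` agree, so a variable that is poorly represented by the three factors (small `R`) is drawn with a correspondingly large circle.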

[Diagram 4: Consumers’ Semi-durable Goods plus Consumers’ Durable Goods]

[The surface of the] 3-sphere may also contain three orthogonal points representing the first
three factors or principal components of variation of the variables; indeed it is with respect to
[these that the] position of the $n$ points representing the variables is determined. It is
also possible to put on to the surface of the sphere the three-factor representation of any other
variables, for example, those which are thought to be useful in explaining the observed variation
of the $n$ variables. It will be seen in this example that the point corresponding to the three-factor
representation of total income over the period lies very close to the point representing the first
factor $F_1$. This indicates that total income and $F_1$ were highly correlated over the period.

[Diagram 5: Consumers’ Services]

In many psychological applications of factor analysis the calculations are made with
standardized variables, i.e. correlation coefficients rather than sums of squares and products form the
basis of the analysis. The main reason for this is the difficulty of finding a common unit of
measurement in which to express the scores in the different tests. This problem is not present in
the application here since all the transaction blocks are measured in a common unit, money.
Accordingly no attempt is made here to standardize the observed variables.

The use of sums of squares and products does not complicate the analysis and indeed simplifies
the arithmetic, but it does have the following effect: The variances of the seventeen observed
variables differ widely, so that a large part of the combined variance may be explained by a
factorial representation in which very little of the variance of the smaller variables is explained. In
fact this does not occur to any important extent in the present case. On the other hand, if all the
variables were standardized, the very small ones would play as important a part in determining
the outcome of the analysis as would the large ones. This is especially undesirable where, as in
the present case, the particular transaction blocks used depend on the way in which the available
series are presented rather than upon a grouping designed on theoretical grounds for this
particular type of analysis. In some respects this application is rather like the earlier work in
psychology in which such test material as came to hand was subjected to analysis. In more
recent psychological studies the test material is carefully selected so that the varied activities of the
mind may be fairly represented. The corresponding refinement in the present case would consist
in a careful a priori grouping of the transactions. In addition it would be appropriate to weight
the data by expressing each transaction block in units of its estimated margin of error. For the
approximate factorial representation is found by minimizing the sum of squares of residues, and
if the analysis is taken far enough the residues will be largely drawn from the errors of estimation.

[Diagram 6: Construction]

[Diagram 7: Net Public Outlay]




From examination of the manner in which the data have been compiled one may have legitimate
prejudices on the relative sizes of the residues that may be ascribed to errors (or to accurately
recorded but irrelevant fluctuations) in the different transaction-blocks. It is a distortion of
purpose to “chase” an unreliable variable as strongly as one with a smaller absolute margin of
error. On the other hand if a few factors suffice to catch all the variables nearly enough, weighting [...]

[Diagram 8: Net Increase in Inventories]

In this investigation an answer based on an examination of data for the United States over 
the years 1922-38 will be given to the following questions: 

First, how far can the variance of a number of transaction-blocks be satisfactorily explained
in terms of a small number of factors? 

Second, if it is found that most of the total variance can be explained by a few factors, how far 
does this satisfactory result extend to the individual variances of the observable variables? Specifi- 
cally, how far are the variables with small variances adequately explained? 

Third, is it possible to identify the factors with series estimated from economic considerations 
and not from the factor-analysis itself? 

Finally, something will be said on the broader possibilities of factor analysis in relation to 
econometric work. 


2. The Data 

This analysis is based on Messrs. Kuznets' and Barger's data* for the components of total 
income and outlay of the United States over the seventeen years 1922-38. For another purpose†
these components had, subject to a small amount of prior aggregation, been correlated by pairs 
and amalgamated where the correlation coefficient exceeded 0-95. As a result the sums of squares 
and products of the series shown in Table I were available. 


Table I. Components of Income and Outlay in the United States of America, 1922-1938

            Sign in                                                                   Standard
            Income (i) or                                                             deviation
Variable    Outlay (o)     Component                                                  (milliards of dollars)

    1           i          Employees’ compensation                                        6.16
    2           o          Consumers’ perishable goods plus producers’ durable goods      4.40
    3           i          Unadjusted net savings of enterprises plus adjustment
                           for capital revaluation                                        3.95
    4           o          Consumers’ semi-durable goods plus consumers’ durable goods    3.70
    5           o          Consumers’ services                                            2.86
    6           o          Construction                                                   2.80
    7           o          Net public outlay                                              1.65
    8           o          Net increase in inventories                                    1.64
    9           i          Inventory revaluation adjustment                               1.56
   10           i          Net rent received by individuals                               1.41
   11           i          Entrepreneurial withdrawals                                    1.36
   12           i          Dividends                                                      1.07
   13          -o          Adjustment for depreciation                                    0.90
   14           i          Interest                                                       0.51
   15           i          Dividends, interest and non-commercial remittances from
                           abroad less direct taxes paid by individuals plus veterans’
                           bonus plus social security benefits less employees’ social
                           security contributions                                         0.51
   16          -i          Adjustment for depreciation and depletion                      0.35
   17           o          Foreign balance including foreign tourist expenditure in
                           the United States                                              0.33

* See Outlay and Income in the United States 1921-38 (1942), by H. Barger, Table I, pp. 42-3, Table
III, pp. 50-1, Table IV, pp. 58-9 and Table V, pp. 62-3. As can be seen from the footnote to Table IV, all
the components of income apart from some small adjusting items are taken from Kuznets’ work. See
National Income and its Composition (1941).

† An application of H. R. Fisher’s method of adjusting the components of the national accounts when
systematic errors are believed to be present. This method and the numerical example referred to here
will shortly appear in The Review of Economic Studies.



Table II. Sums of Squares and Products of Deviations from Means

[The numerical entries of Table II are illegible in this copy.]

[Table III, the correlation coefficients between the seventeen series, is likewise illegible in this copy.]


The components of income and outlay were independently estimated, and over the period total
outlay was on the average about 3.4 per cent. greater than the income total. For the present
purpose no attempt has been made to adjust the components so as to remove this discrepancy.
Denoting the sums of squares and products of any pair of series $x_j$ and $x_k$ by $m_{jk}$ we have

$$m_{jk} = \sum_t x_j x_k = \sum_t (X_j - \bar{X}_j)(X_k - \bar{X}_k)$$

and these values are shown in Table II, while the correlation coefficients, not used in the analysis
but of interest in themselves, are shown in Table III.

The matrix given in Table II, which will be denoted by $m = [m_{jk}]$, forms the basis of the
subsequent analysis as indicated in the following section which, while not intended as a complete
or rigorous exposition of the Principal Factor method, will introduce the necessary algebraic
conceptions.
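
As a small illustration of how a matrix of this kind is built up, the following sketch computes sums of squares and products of deviations from means for two short series. The function name `moment_matrix` and the figures in `X` are invented for the example; they are not entries of Table II.

```python
def moment_matrix(series):
    """Sums of squares and products of deviations from means.

    `series` holds one list of observations per variable (all of equal
    length). The result m satisfies m[j][k] = sum_t x_j(t) * x_k(t),
    where x_j(t) is the deviation of variable j from its own mean.
    """
    deviations = []
    for values in series:
        mean = sum(values) / len(values)
        deviations.append([v - mean for v in values])
    n = len(deviations)
    return [[sum(p * q for p, q in zip(deviations[j], deviations[k]))
             for k in range(n)]
            for j in range(n)]


# Hypothetical data: two variables observed on four occasions.
X = [[1.0, 2.0, 3.0, 6.0],
     [2.0, 1.0, 4.0, 5.0]]
m = moment_matrix(X)
```

The diagonal of `m` holds the sums of squares and the off-diagonal entries the sums of products; dividing `m[0][1]` by the square root of `m[0][0] * m[1][1]` would give the corresponding correlation coefficient.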


3. Method and Results 

In the first-factor representation of the data it is desired that each $x_j$, as it ranges through time,
shall be as nearly as possible reproduced by $a_{j1}F_1$, where the coefficients $a_{j1}$ are constants, one for
each variable, and $F_1$ is the factor varying through time. The $a_{j1}$’s and the range of values of $F_1$
are to be chosen so that the sum of squares of residues for all variables through time, viz.

$$\sum_j \sum_t (x_j - a_{j1}F_1)^2$$

is a minimum; but since the coefficients and the factor values that produce the minimum are
indeterminate to the extent of a constant that may multiply the $a_{j1}$’s and divide the values of $F_1$,
the condition is imposed that $F_1^2$ shall have unit sum, that is the factor is sought in normalized
form. The sets of minimizing equations with respect first to each $a_{j1}$ and then with respect to
each value of $F_1$ in the time series take the forms

$$a_{j1} = \sum_t x_j F_1 \qquad (2)$$

and

$$\lambda_1 F_1 = \sum_j a_{j1} x_j \qquad (3)$$

where $\lambda_1$ is some constant, being $\sum_j a_{j1}^2$ plus the indeterminate multiplier introduced into the
minimizing equations by the condition normalizing $F_1$. Actually this multiplier is zero, for on
multiplying (2) by $a_{j1}$ and using (3) we find that

$$\sum_j a_{j1}^2 = \sum_j \sum_t a_{j1} x_j F_1 = \lambda_1 \sum_t F_1^2 = \lambda_1 \qquad (4)$$

Equation (2) means that the $a_{j1}$’s are the regression coefficients of the $x_j$’s on $F_1$, and that the residues
$(x_j - a_{j1}F_1)$ are for each $j$ orthogonal to $F_1$. Equation (3) means that $F_1$ is a linear function of
the $x_j$’s, the coefficients being proportional to the same quantities $a_{j1}$. The value of $\lambda_1$ appears
when $F_1$ is eliminated between (2) and (3), the result being the set of $n$ equations

$$\lambda_1 a_{j1} = \sum_t \sum_k x_j a_{k1} x_k = \sum_k a_{k1} m_{jk} \qquad (5)$$

Elimination of the $a$’s shows that $\lambda_1$ must be a latent root of the matrix $m$. When $\lambda_1$ has its proper
value, equations (5) are consistent and determine the ratios between the $a_{j1}$’s, the values of which
are then fixed by (4), save for an ambiguity of sign, which may be settled arbitrarily.

The sum of squares of residues, $\sum_t \sum_j (x_j - a_{j1}F_1)^2$, is less than $\sum_t \sum_j x_j^2$ by $\sum_j a_{j1}^2$, that is by $\lambda_1$,
and therefore $\lambda_1$ is chosen to be the largest of the latent roots, which are necessarily all real and
non-negative.
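
The relations (2) to (5) can be tried out numerically. The sketch below is one possible way of doing so, not the computing routine used for the paper: it finds the largest latent root of $m$ by simple power iteration (an assumed technique chosen for brevity), scales the latent vector so that (4) holds, and forms $F_1$ from (3). The function name `first_factor` and the data `x` are hypothetical.

```python
import math


def first_factor(x, iterations=500):
    """First principal factor of deviations x[j][t].

    Returns (a, F, lam): the coefficients a_j1, the normalized factor
    values F_1(t), and the largest latent root lam of m = [sum_t x_j x_k].
    """
    n, N = len(x), len(x[0])
    m = [[sum(x[j][t] * x[k][t] for t in range(N)) for k in range(n)]
         for j in range(n)]
    # Power iteration: repeated multiplication by m converges on the latent
    # vector of the largest root (assumed here to be a simple root).
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(m[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[j] * m[j][k] * v[k] for j in range(n) for k in range(n))
    a = [c * math.sqrt(lam) for c in v]        # so that sum_j a_j1^2 = lam, eq. (4)
    F = [sum(a[j] * x[j][t] for j in range(n)) / lam
         for t in range(N)]                    # eq. (3)
    return a, F, lam


# Hypothetical deviations from means: 3 variables on 4 occasions.
x = [[-2.0, -1.0, 0.0, 3.0],
     [-1.0, -2.0, 1.0, 2.0],
     [1.0, 0.0, -2.0, 1.0]]
a, F, lam = first_factor(x)
total = sum(c * c for row in x for c in row)
residual = sum((x[j][t] - a[j] * F[t]) ** 2
               for j in range(3) for t in range(4))
```

With these made-up figures the factor values have unit sum of squares, each `a[j]` equals the sum of products of `x[j]` with `F` as in (2), and `residual` falls short of `total` by `lam`.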





A second factor $F_2$ may now be taken out in the same way from the first residues. $F_2$ will be
a linear function of these residues, and therefore orthogonal to $F_1$. Consequently $a_{j2}$, which by
analogy with (2) is $\sum_t (x_j - a_{j1}F_1)F_2$, reduces to $\sum_t x_j F_2$, and so

$$\sum_j a_{j1} a_{j2} = \sum_j \sum_t a_{j1} x_j F_2 = \lambda_1 \sum_t F_1 F_2 = 0 \qquad (6)$$

The second residues are orthogonal to $F_2$ (as the first were to $F_1$) and to $F_1$, because they are
linear functions of the first residues. Therefore when $F_3$ is found as a linear function of the second
residues it too is orthogonal to both $F_2$ and $F_1$. In this way it appears that each principal factor
is orthogonal to the rest. Also, by repetition of the argument leading to (6) all the sets of
coefficients are mutually orthogonal in the sense that

$$\sum_j a_{jr} a_{js} = \begin{cases} 0 & \text{when } r \neq s \\ \lambda_r & \text{when } r = s \end{cases} \qquad (7)$$

In the above process $\lambda_2$ is the largest latent root of the first residual matrix, of which the typical
element is

$$\sum_t (x_j - a_{j1}F_1)(x_k - a_{k1}F_1) = m_{jk} - a_{j1}a_{k1}; \qquad (8)$$

and $\lambda_3$ is the largest root of the second residual matrix, and so on. It may be shown that $\lambda_2$,
$\lambda_3$, etc., are respectively the second, third, etc., largest latent roots of $m$. This is hardly of practical
interest since the computor finds $\lambda_2$, $\lambda_3$, etc., via the appropriate residual matrices,* but it shows
how the principal factors exhaust the data. Each factor $F_s$ can be used to take $\lambda_s$ from the sum
of squares, and the characteristic equation for $\lambda$ is so formed that $\sum_s \lambda_s = \sum_j m_{jj} = \sum_j \sum_t x_j^2$. Thus
each of the $n$ equations of the type

$$x_j = a_{j1}F_1 + a_{j2}F_2 + a_{j3}F_3 + \ldots \qquad (9)$$

is exact if all the $n$ factors are included, and is an approximation (a regression equation) if the
earlier factors alone are used. Equations such as (3), giving a factor value in terms of the data,
are exact.
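
The successive extraction of factors from residual matrices can be sketched as follows. This is an illustration under assumptions, not the practical method of computation cited in the footnote: each largest root is found by power iteration, and the residual matrix is formed as in (8). The names `largest_root` and `principal_factors`, and the 3 by 3 matrix `m`, are hypothetical.

```python
import math


def largest_root(m, iterations=2000):
    """Largest latent root of a symmetric matrix and its latent vector,
    scaled so that the sum of squares of the vector equals the root."""
    n = len(m)
    v = [1.0 + 0.1 * j for j in range(n)]      # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(m[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[j] * m[j][k] * v[k] for j in range(n) for k in range(n))
    return lam, [c * math.sqrt(lam) for c in v]


def principal_factors(m, count):
    """Take out `count` factors in turn, each from the residual matrix
    left by its predecessors, as in eq. (8)."""
    n = len(m)
    residual = [row[:] for row in m]
    roots, patterns = [], []
    for _ in range(count):
        lam, a = largest_root(residual)
        roots.append(lam)
        patterns.append(a)
        residual = [[residual[j][k] - a[j] * a[k] for k in range(n)]
                    for j in range(n)]
    return roots, patterns


# Hypothetical matrix of sums of squares and products for 3 variables.
m = [[14.0, 10.0, 1.0],
     [10.0, 10.0, -1.0],
     [1.0, -1.0, 6.0]]
roots, patterns = principal_factors(m, 3)
```

For this made-up matrix the three roots come out in descending order and add up to the trace of `m`, the three sets of coefficients satisfy the orthogonality relations (7), and the products of coefficients summed over all three factors rebuild `m` exactly, which is the sense in which the factors exhaust the data.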


The coefficients of the factor pattern, the $a_{js}$, for the first three factors are shown below in
Table IV, together with the actual sums of squares, $\sum_t x_j^2$, the calculated values, $\sum_{s=1}^{3} a_{js}^2$,
and the variance ratios, $\sum_{s=1}^{3} a_{js}^2 / \sum_t x_j^2 = R^2_{j\cdot 123}$. The final row of the table shows the
proportion of the combined original variance explained by each of the factors, i.e.
$\sum_j a_{js}^2 / \sum_j \sum_t x_j^2$.

In Table V the pattern coefficients $a_{js}$ are divided through by the appropriate $\sqrt{m_{jj}}$, yielding
the correlations, $r_{js}$, between the variables and the factors.

Regression equations of the form of (9), the coefficients of which are shown in Table IV,
are illustrated in Diagrams 1-17. In these are shown the actual and calculated values of each $x_j$,
the discrepancy between the two values and the contribution of each of the three factors to the
calculated value.

As already mentioned, equations of the form of (3) above provide exact estimates of the $F_s$
in terms of the $x_j$. In Table VI are shown the normalized factors calculated in this way. In each
case $\sum_t F_s^2 = 1$.

* For a practical method of computation see Holzinger and Harman, op. cit., Appendix D, and Thomson,
op. cit., Chapter V.





Table IV. Pattern Coefficients, Actual and Calculated Sums of Squares and Variance Ratios

               Pattern coefficients multiplying
                   F_1, F_2 and F_3                          Sums of squares
Variable       a_j1         a_j2         a_j3         Original     Calculated     Ratio

    1        24.9622      -3.8396      -2.3958        644.6372      643.5937     0.9984
    2        17.2581      -0.5389      -5.2854        328.4562      326.0687     0.9927
    3        11.5443      11.4291      -0.3839        265.5798      264.0424     0.9942
    4        14.8296       1.4191       2.6247        232.4089      228.8191     0.9846
    5        10.7049      -4.1033       2.4457        138.6140      137.4138     0.9913
    6         9.8883       0.8251       5.8749        133.7405      132.9730     0.9943
    7        -4.8183      -0.2970      -3.9568         46.5015       38.9605     0.8378
    8         4.3872       2.1437      -2.2301         45.5014       28.8161     0.6333
    9         1.9175      -5.4593       1.0209         41.3430       34.5229     0.8350
   10         4.2928       1.0580       3.6537         33.7052       32.8966     0.9760
   11         5.1436      -1.3238       1.6324         31.3837       30.8737     0.9837
   12         3.7408      -1.6395      -1.0711         19.4518       17.8286     0.9166
   13         3.3644      -1.3508       0.3001         13.6865       13.2336     0.9669
   14         0.0002      -1.8108       0.0167          4.3806        3.2794     0.7486
   15        -0.9592      -0.0186       0.4839          4.3639        1.1546     0.2646
   16         1.2068       0.4268       0.1230          2.0576        1.6536     0.8037
   17         0.2907      -0.0981       0.8570          1.8912        0.8286     0.4381

 λ_s       1605.3132     210.5318     121.1148       1987.7030     1936.9589     0.9745
 Proportion   0.8076       0.1059       0.0609          1.0000        0.9745     0.9745
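
The internal consistency of Table IV can be checked directly from its printed figures. The short script below is offered only as such a check: it recomputes the latent roots as column sums of squares, the cross-products that relation (7) requires to vanish, and the calculated sums of squares, ratios and proportions. The variable names are arbitrary; the numbers are those printed in the table.

```python
# Pattern coefficients (a_j1, a_j2, a_j3) as printed in Table IV.
A = [
    (24.9622, -3.8396, -2.3958), (17.2581, -0.5389, -5.2854),
    (11.5443, 11.4291, -0.3839), (14.8296, 1.4191, 2.6247),
    (10.7049, -4.1033, 2.4457), (9.8883, 0.8251, 5.8749),
    (-4.8183, -0.2970, -3.9568), (4.3872, 2.1437, -2.2301),
    (1.9175, -5.4593, 1.0209), (4.2928, 1.0580, 3.6537),
    (5.1436, -1.3238, 1.6324), (3.7408, -1.6395, -1.0711),
    (3.3644, -1.3508, 0.3001), (0.0002, -1.8108, 0.0167),
    (-0.9592, -0.0186, 0.4839), (1.2068, 0.4268, 0.1230),
    (0.2907, -0.0981, 0.8570),
]
# Original sums of squares m_jj as printed in Table IV.
ORIGINAL = [644.6372, 328.4562, 265.5798, 232.4089, 138.6140, 133.7405,
            46.5015, 45.5014, 41.3430, 33.7052, 31.3837, 19.4518,
            13.6865, 4.3806, 4.3639, 2.0576, 1.8912]

# Latent roots: sum over variables of the squared coefficients, eq. (7), r = s.
roots = [sum(row[s] ** 2 for row in A) for s in range(3)]

# Cross-products between different factors, eq. (7), r != s: should vanish.
cross = [sum(row[r] * row[s] for row in A)
         for r in range(3) for s in range(r + 1, 3)]

# Calculated sums of squares and variance ratios for each variable.
calculated = [sum(c * c for c in row) for row in A]
ratios = [c / o for c, o in zip(calculated, ORIGINAL)]

# Share of the combined variance taken by each factor.
total = sum(ORIGINAL)
proportions = [lam / total for lam in roots]
```

Up to rounding in the fourth decimal place, `roots` reproduces the row of latent roots, `cross` is negligible, and `proportions` gives the final row of the table.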


Table V. Correlation Coefficients Between the Variables and the Factors

Variable        1.          2.          3.

    1         0.983      -0.151      -0.094
    2         0.952      -0.030      -0.292
    3         0.708       0.701      -0.024
    4         0.973       0.093       0.172
    5         0.909      -0.349       0.208
    6         0.855       0.071       0.508
    7        -0.707      -0.044      -0.580
    8         0.650       0.318      -0.331
    9         0.298      -0.849       0.159
   10         0.739       0.182       0.629
   11         0.918      -0.236       0.291
   12         0.848      -0.372      -0.243
   13         0.909      -0.365       0.081
   14         0.000      -0.865       0.008
   15        -0.459      -0.009       0.232
   16         0.841       0.298       0.086
   17         0.211      -0.071       0.623
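
The passage from Table IV to Table V is a single division, and may be illustrated for a few of the variables. In the sketch below the pattern coefficients and original sums of squares are those printed in Table IV for variables 1, 3, 7, 14 and 17; the dictionary layout and the function name `correlations` are merely conveniences of the example.

```python
import math

# variable: ((a_j1, a_j2, a_j3), m_jj), figures as printed in Table IV.
TABLE_IV = {
    1: ((24.9622, -3.8396, -2.3958), 644.6372),
    3: ((11.5443, 11.4291, -0.3839), 265.5798),
    7: ((-4.8183, -0.2970, -3.9568), 46.5015),
    14: ((0.0002, -1.8108, 0.0167), 4.3806),
    17: ((0.2907, -0.0981, 0.8570), 1.8912),
}


def correlations(pattern, m_jj):
    """r_js = a_js / sqrt(m_jj) for s = 1, 2, 3."""
    scale = math.sqrt(m_jj)
    return tuple(a / scale for a in pattern)


r = {j: correlations(pattern, m_jj) for j, (pattern, m_jj) in TABLE_IV.items()}

# The squared correlations of a variable with the three orthogonal factors
# add up to its variance ratio in Table IV.
ratio_1 = sum(c * c for c in r[1])
```

Rounded to three places, the entries of `r` agree with the corresponding rows of Table V, and `ratio_1` reproduces the ratio 0.9984 shown for variable 1.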





Table VI. Normalized Factors 1922-38

[Minus signs are not reliably legible in this copy; the entries are given as they can be read.]

Year          F_1          F_2          F_3

1922       -0.1316       0.3258       0.2131
1923        0.0800       0.2837       0.1707
1924        0.0752       0.1380       0.2877
1925        0.1680       0.2084       0.2009
1926        0.2581       0.0667       0.1338
1927        0.2405      -0.0404       0.1561
1928        0.2733       0.0324       0.1007
1929        0.3689       0.0916      -0.0795
1930        0.1424      -0.5228      -0.0015
1931       -0.1549      -0.5038       0.1261
1932       -0.4537      -0.3383       0.2130
1933       -0.4607       0.1583       0.1118
1934        0.3111       0.1583       0.1753
1935       -0.1817       0.1432      -0.3019
1936       -0.0132       0.1245       0.4676
1937        0.1195       0.0077      -0.4748
1938       -0.0186      -0.0866       0.2733


4. A Discussion of the Results 

We may begin by attempting to answer the three questions set out at the end of Section 1.
First, it can be seen from Table IV that 97.5 per cent. of the combined variances of the seventeen
variables can be explained by three orthogonal factors. The first factor accounts for 80.8 per
cent. of the combined variances, the second for 10.6 per cent. and the third for 6.1 per cent., leaving
only 2.5 per cent. to be accounted for by the remaining thirteen* factors that could be extracted.

Second, Table VII below shows the extent to which the individual variables are well explained
by the three factors. The squares of the multiple correlation coefficients, $R^2_{j\cdot 123}$, are
cross-classified by size and by the standard deviation of $x_j$, so that the extent to which the variation of
the smaller variables is explained can readily be seen.


Table VII. Squares of Multiple Correlation Coefficients between Variables and Factors Classified
by Size and by the Standard Deviations of the Variables

                        Standard deviation of x_j
R^2_{j.123}         0-1.49     1.50-2.99     3.00-       Total

     -0.499            2           -           -            2
0.50-0.749             1           1           -            2
0.75-0.949             2           2           -            4
0.95-                  3           2           4            9

Total                  8           5           4           17
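
The cross-classification of Table VII can be rebuilt from figures given elsewhere in the paper, which serves as a check on the reading of the table. The sketch below takes the variance ratios printed in Table IV and the standard deviations printed in Table I; the class boundary of 1.50 between the first two columns is an assumption, since only the limits 1.49, 2.99 and 3.00 can be made out in the column headings.

```python
from bisect import bisect_right

# Variance ratios (squared multiple correlations), variables 1-17, Table IV.
RATIO = [0.9984, 0.9927, 0.9942, 0.9846, 0.9913, 0.9943, 0.8378, 0.6333,
         0.8350, 0.9760, 0.9837, 0.9166, 0.9669, 0.7486, 0.2646, 0.8037,
         0.4381]
# Standard deviations in milliards of dollars, variables 1-17, Table I.
SD = [6.16, 4.40, 3.95, 3.70, 2.86, 2.80, 1.65, 1.64, 1.56, 1.41, 1.36,
      1.07, 0.90, 0.51, 0.51, 0.35, 0.33]

R2_EDGES = [0.50, 0.75, 0.95]   # lower limits of the 2nd, 3rd and 4th rows
SD_EDGES = [1.50, 3.00]         # lower limits of the 2nd and 3rd columns (1.50 assumed)

# table[row][column] counts the variables falling in each cell.
table = [[0, 0, 0] for _ in range(4)]
for r2, sd in zip(RATIO, SD):
    table[bisect_right(R2_EDGES, r2)][bisect_right(SD_EDGES, sd)] += 1

row_totals = [sum(row) for row in table]
column_totals = [sum(row[c] for row in table) for c in range(3)]
```

The resulting counts, with their row and column totals, agree with the table as set out above: eight variables fall in the column of smallest standard deviations, and nine of the seventeen ratios exceed 0.95.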


We may conclude from this table that although the variation of the smaller variables is not so 
well accounted for by the three factors^ as that of the larger variables, a fact which is not surprising, 
since sums of products and not correlations form the basis of the analysis and the individual 
variances show large differences, even this variation is accounted for to a high degree. In only 
two cases out of eight in the column containing the variables with the smallest variance is the 

* The number of remaining factors is thirteen, making sixteen, not seventeen factors in all, because the
number of years analysed is unfortunately small. Each variable is observed on 17 occasions only, and is
brought into the analysis as a deviation from a mean. Of the 17 deviations of any one variable, only 16
are linearly independent. The matrix of data measured from means has rank 16 only, and so has m. One
of the latent roots of m is zero.



1947] 


of Blocks of Transactions 


17 


Diagram 9 

Inventory Revaluation Adjustment 



square of the multiple correlation coefficient, i.e. the proportion of the variance of the variable
accounted for by the three factors, less than 0.50. In just over half of all the seventeen cases it
exceeds 0.95.

A consideration of the composition of the series shows that the possibility of large systematic 
variation is not exhausted by taking out the variation due to three common factors. For example, 




18 


Stone — On the Interdependence 


[No. 1, 


the variation of item 15, which is really a “rag-bag” composed of a miscellaneous assortment of 
small items, is largely explained by the three factors except for the year 1936, when a large veterans’ 
bonus was paid out. Disturbances of this kind are well known to be an important element in 
economic variation, and it is not surprising, therefore, that all systematic variation is not exhausted


Diagram 10

Net Rent received by Individuals



by what can be accounted for by three factors. In this connection it is perhaps of interest to
inspect the residuals of the $\tfrac{1}{2}n(n-1) = 136$ sums of products after the variability due to the three
factors has been removed. A frequency distribution of these residuals, which may be written
$\left(m_{jk} - \sum_{s=1}^{3} a_{sj} a_{sk}\right)$, is shown in Table VIII together with the corresponding normal frequencies.





Table VIII. — Frequency Distribution of Third Residual Sums of Products Compared with the
Corresponding Normal Frequencies

                                      Frequencies
Range of Residuals      Observed      Based on normal curve with same
                                      mean and standard deviation

-4.0 to -3.5                1              0.1
-3.5 to -3.0                1              0.2
-3.0 to -2.5                2              0.9
-2.5 to -2.0                1              2.8
-2.0 to -1.5                3              6.4
-1.5 to -1.0                4             12.2
-1.0 to -0.5               17             19.4
-0.5 to  0.0               37             23.9
 0.0 to  0.5               33             24.3
 0.5 to  1.0               22             20.5
 1.0 to  1.5                8             13.3
 1.5 to  2.0                2              7.3
 2.0 to  2.5                1              3.2
 2.5 to  3.0                1              1.1
 3.0 to  3.5                1              0.3
 3.5 to  4.0                2              0.1

Total                     136            136.0


The mean of the residuals, 0.043, does not differ significantly from zero in spite of the fact that
the sums of squares as opposed to products have not been included. It hardly requires an application
of the $\chi^2$-test to confirm the impression that the chances against the observed residuals
being drawn from a normal population exceed 100 to 1. The distribution of the residuals has an
undue proportion of relatively small and relatively large residuals, positive and negative, which
may perhaps be attributed to minor systematic sources of variation not accounted for by the three
factors. It must be recognized however that these figures are difficult to interpret, for they represent
the sums of products of residues which in practice are almost certainly correlated. It is not
clear, therefore, what distribution the residual sums of products would follow on the assumption
that all systematic variation had been removed by the first m factors.
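
The impression stated above can be checked roughly from the figures in Table VIII. The sketch below is an illustration and not the author's own calculation. Three things are assumptions made here: the rule of pooling adjacent classes until each expects at least five residuals, the allowance for two fitted constants (mean and standard deviation) in the degrees of freedom, and the names `pool_cells` and `CRITICAL_1_PERCENT_5_DF`.

```python
# Observed and normal-curve frequencies transcribed from Table VIII
# (sixteen classes of width 0.5 running from -4.0 to 4.0).
observed = [1, 1, 2, 1, 3, 4, 17, 37, 33, 22, 8, 2, 1, 1, 1, 2]
expected = [0.1, 0.2, 0.9, 2.8, 6.4, 12.2, 19.4, 23.9,
            24.3, 20.5, 13.3, 7.3, 3.2, 1.1, 0.3, 0.1]


def pool_cells(obs, exp, minimum=5.0):
    """Merge adjacent classes until each pooled class expects >= minimum."""
    pooled = []
    acc_o, acc_e = 0, 0.0
    for o, e in zip(obs, exp):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            pooled.append((acc_o, acc_e))
            acc_o, acc_e = 0, 0.0
    if acc_e > 0 and pooled:
        # Fold the leftover upper tail into the last complete class.
        last_o, last_e = pooled.pop()
        pooled.append((last_o + acc_o, last_e + acc_e))
    return pooled


cells = pool_cells(observed, expected)
chi_square = sum((o - e) ** 2 / e for o, e in cells)

# Assumption: one constraint for the total and two for the fitted mean and s.d.
degrees_of_freedom = len(cells) - 1 - 2

# Tabulated upper 1 per cent. point of chi-square on 5 degrees of freedom.
CRITICAL_1_PERCENT_5_DF = 15.086

print(len(cells), degrees_of_freedom, round(chi_square, 2))
print(chi_square > CRITICAL_1_PERCENT_5_DF)
```

Under these assumptions the statistic comes to about 21 on 5 degrees of freedom, which is beyond the 1 per cent. point and so agrees with the odds quoted in the text. A different pooling rule would change the figure somewhat.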

There remains the third question, namely, the interpretation of the factors. It was the expectation
of the writer before the analysis was undertaken that most of the variation of the seventeen
variables would be capable of being accounted for by the three components :

(a) Total income or some similar quantity.

(b) Rate of change of (a), and —

(c) A trend term representing the underlying tendency of the economy to expand or
contract, or of its elements to do so relative to one another.

What is the relation of these concepts to the principal factors extracted from the data? It is
important to remember that the individual principal factors are mere arithmetical abstractions,
chosen for certain convenient algebraic properties. There is no general reason for identifying
them singly with underlying causes. Any equal number of independent linear combinations of
the factors could be used to give precisely the same factorial representation of the data. The important
thing that has been demonstrated is that, save for small residues, nearly all the variables can
be reproduced by a three-factor system, and there has been found that set of variate values of rank
three which approximate to the data most nearly.* If the residues are neglected, the adequacy
of (a), (b), (c) for the representation of the data may be judged by the extent to which they themselves
fit into the same three-factor system.

* That the calculated three-factor system is the best three-factor approximation considered as a whole is
not immediately obvious, since we have merely chosen each principal factor in succession to be that which,
taken singly, reduces the sum of squares of residues as much as any one factor taken singly can reduce
it. It may, however, be proved that no $k$-factor representation can reduce the sum of squares by more than
the first $k$ principal factors reduce it.
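
The statement that independent linear combinations of the factors give precisely the same factorial representation can be illustrated numerically. The following is a minimal sketch under stated assumptions: the loadings and factor values are hypothetical figures chosen for the illustration, not taken from the paper, and a plane rotation is used as the simplest case of an independent linear combination (an oblique transformation would need the matrix inverse in place of the transpose).

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


# Hypothetical loadings: 2 variables x 3 factors.
loadings = [[0.9, 0.3, 0.1],
            [0.2, 0.8, -0.4]]

# Hypothetical factor values: 3 factors x 4 occasions, normalized and orthogonal.
factors = [[0.5, -0.5, 0.5, -0.5],
           [0.5, 0.5, -0.5, -0.5],
           [0.5, -0.5, -0.5, 0.5]]

# Rotate the first two factors through 30 degrees, leaving the third alone.
angle = math.radians(30.0)
c, s = math.cos(angle), math.sin(angle)
rotation = [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

new_factors = matmul(rotation, factors)
# For a rotation the inverse is the transpose, so the loadings change to L R'.
new_loadings = matmul(loadings, transpose(rotation))

fitted_old = matmul(loadings, factors)
fitted_new = matmul(new_loadings, new_factors)

max_gap = max(abs(p - q)
              for row_p, row_q in zip(fitted_old, fitted_new)
              for p, q in zip(row_p, row_q))
print(max_gap < 1e-12)
```

The individual factors and loadings change under the rotation, but the fitted values of the variables do not, which is why no single principal factor need be identified with an underlying cause.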





Component (a) may be expected to fit in well, because it must itself be estimated by adding
together several of the larger variables. There is a slight difficulty resulting from the imperfections
of the data. The variables fall into two sets, of which the sums are total income, $i$, and total
outlay, $o$, which would be identical if the estimates were perfect. However, the correlation between
$i$ and $o$ is very high, $r_{io} = 0.993$, so it will be convenient to refer to (a) as total income in what
follows.

Table IX sets out the zero-order correlation coefficients between the six variables, $F_1$, $F_2$, $F_3$,
$i$, $\Delta i$ and $t$.


Table IX. — Correlation Coefficients between the Factors and Certain Economic Variables

           F1         F2         F3         i          Δi         t

F1         1
F2         0          1
F3         0          0          1
i          0.995     -0.041      0.057      1
Δi        -0.056      0.948     -0.124     -0.102      1
t         -0.369     -0.282     -0.836     -0.414     -0.112      1


The series for $\Delta i$ used to obtain the correlations in this table is based on quarterly data which
are not quite as complete as the data for $i$ itself though they are compiled by the same author.*
The figure for $\Delta i$ at time $t$ is obtained by subtracting the figure for $i$ in the year ending at the end of
the quarter $t - 1$ from the corresponding figure for $i$ in the year ending at the end of the quarter
$t$. Since quarterly estimates of $i$ are available for 1921 an estimate can be made of $\Delta i$ for
1922. A corresponding estimate cannot, however, be made for 1938, so that the correlations
involving $\Delta i$ are based on 16 and not 17 observations.

The coefficients in the equations for the normalized values of $i$, $\Delta i$ and $t$ in terms of the factors
are the correlation coefficients in the bottom left-hand quadrant of Table IX. The sum of the
squares of these coefficients for each variable gives the proportion of the variance of that variable
which is accounted for by the three factors. Thus, in the case of $i$,
$R^2_{i.123} = (0.995)^2 + (-0.041)^2 + (0.057)^2 = 0.995$.
Similarly $R^2_{\Delta i.123}$ and $R^2_{t.123}$ are equal to 0.919 and 0.915 respectively.

Accordingly, the variables $i$, $\Delta i$ and $t$ fit reasonably well into the three-factor system and are
represented on the spherical correlation map described above.
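
The proportions just quoted can be recomputed from the correlations printed in Table IX. This is a small sketch, assuming the three-decimal figures in the table are used as they stand; the dictionary keys are labels chosen here for convenience.

```python
# Correlations of i, delta-i and t with F1, F2, F3
# (bottom left-hand quadrant of Table IX).
loadings = {
    "i": (0.995, -0.041, 0.057),
    "delta_i": (-0.056, 0.948, -0.124),
    "t": (-0.369, -0.282, -0.836),
}

# Because the factors are uncorrelated, the proportion of variance explained
# is simply the sum of the squared correlations with the three factors.
explained = {name: sum(r * r for r in row) for name, row in loadings.items()}

for name, share in explained.items():
    print(name, round(share, 3))
```

With the rounded entries the results for $i$ and $t$ agree with the text to three decimals. The result for $\Delta i$ comes out a little below the 0.919 quoted, a difference that may be due to rounding in the printed table.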

The figures in the bottom right-hand quadrant of Table IX are the zero-order correlation
coefficients between $i$, $\Delta i$ and $t$. The correlations between the three-factor representations of
these variables can readily be derived from the coefficients in the bottom left-hand corner of the
table, since, as already mentioned, these are the coefficients in the regression equations expressing
the normalized form of the variables $i$, $\Delta i$ and $t$ in terms of the three factors. Since the factors are
both normalized and orthogonal, so that $\sum_t F_i^2 = 1$ and $\sum_t F_i F_j = 0$, the correlation between the
three-factor representation of any pair of variates is given by the sum of the products of the coefficients
for like factors in the regression equations. Thus writing $\rho$ for the zero-order correlation
coefficient between the three-factor representations we have for example
$\rho_{it} = (0.995 \times -0.369) + (-0.041 \times -0.282) + (0.057 \times -0.836) = -0.403$
compared with a value of $-0.414$ for $r_{it}$. Similarly $\rho_{i \Delta i} = r_{i \Delta i} = -0.102$, while
$\rho_{t \Delta i} = -0.143$, compared with a value of $-0.112$ for $r_{t \Delta i}$.
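
The rule of summing the products of coefficients for like factors is easily checked against the figures in the text. The sketch below applies it to each pair of variables; the function name `represented_correlation` and the dictionary keys are hypothetical labels introduced here.

```python
from itertools import combinations

# Regression coefficients on F1, F2, F3 (bottom left-hand quadrant of Table IX).
loadings = {
    "i": (0.995, -0.041, 0.057),
    "delta_i": (-0.056, 0.948, -0.124),
    "t": (-0.369, -0.282, -0.836),
}


def represented_correlation(first, second):
    """Sum of products of the coefficients for like factors, as in the text."""
    return sum(a * b for a, b in zip(loadings[first], loadings[second]))


rho = {pair: represented_correlation(*pair)
       for pair in combinations(loadings, 2)}

for pair, value in rho.items():
    print(pair, round(value, 3))
```

The three values reproduce the $-0.403$, $-0.102$ and $-0.143$ given above, which also supports the signs read for the entries of Table IX.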

It would be possible to give a rather more complete treatment of these interrelationships
through the calculation of $\Delta F_1$, since this would make it possible to express the three-factor
representations of the original data and the original data themselves in terms of $F_1$, $\Delta F_1$ and $t$. It
would also be of interest to attempt to find a factor $\Phi$ within the $F$-system, such that its rate of
change is as nearly as possible within the $F$-system also. In the present application it was
thought that these extensions were not worth while, since they would have involved very considerable
additional calculation and the general conclusion that most of the variables, $x_j$, can be
explained fairly accurately by an expression of the form —

$x_j = a\,i + b\,\Delta i + c\,t$     (10)

seems to be established without their aid.

* Barger, op. cit. Table XVIII, pp. 179-83.

The equations expressing the variables in terms of the factors are additive and can be summed
to give $i$ and $o = i + \epsilon$, where $\epsilon$ is the discrepancy between the two estimates, in terms of the
three factors. In this way we obtain —

$o = 49.1762\,F_1 + 0.7014\,F_2 + 0.0297\,F_3$     (11)

$i = 49.4353\,F_1 - 2.0313\,F_2 + 2.8339\,F_3$     (12)

whence —

$\epsilon = -0.2591\,F_1 + 2.7327\,F_2 - 2.8042\,F_3$     (13)

or, if $\epsilon$ is normalized to give $\epsilon^*$ —

$\epsilon^* = -0.0445\,F_1 + 0.4697\,F_2 - 0.4820\,F_3$     (14)

This regression equation is illustrated in Diagram 18.

The variance ratio for calculated over observed $\epsilon$, i.e. $R^2_{\epsilon.123}$, is 0.4549, so that to some extent
the variations in the discrepancy can be explained in terms of the three factors. Equation (14)
shows that these variations are mainly to be accounted for in terms of the second and third factors
rather than the first. In other words the discrepancy between outlay and income, apart from a
constant term, tends to be large when income is changing and tends, rather unexpectedly, to
increase with time. This confirms a conclusion previously reached on the basis of an analysis
of variance.*
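
The arithmetic linking equations (11) to (14) with the variance ratio can be retraced in a few lines. This is a sketch using only the coefficients printed in the text; the variable names, and the reading of the normalizing divisor as a single common scale, are assumptions made here.

```python
# Coefficients on F1, F2, F3 as printed in the text.
outlay = (49.1762, 0.7014, 0.0297)       # equation (11)
income = (49.4353, -2.0313, 2.8339)      # equation (12)
normalized = (-0.0445, 0.4697, -0.4820)  # equation (14)

# Equation (13): the discrepancy is outlay less income, factor by factor.
discrepancy = tuple(o - i for o, i in zip(outlay, income))

# With normalized, orthogonal factors the explained proportion of variance
# is the sum of squares of the normalized coefficients.
variance_ratio = sum(c * c for c in normalized)

# Each coefficient of (13) divided by its counterpart in (14) should give
# roughly the same scale factor used to normalize the discrepancy.
implied_scale = [d / n for d, n in zip(discrepancy, normalized)]

print([round(d, 4) for d in discrepancy])
print(round(variance_ratio, 4))
print([round(s, 2) for s in implied_scale])
```

The differences reproduce equation (13), the sum of squares reproduces the ratio 0.4549, and the three implied scale factors agree closely with one another, as they should if (14) is (13) divided through by one constant.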

At this point it will be convenient to consider the coefficients of the seventeen equations set
out in Table V. These figures may be considered along with their squares set out in Table X
below, which show the proportion of the variance of each variable which is accounted for by
each of the three factors. This simple partition of the variance is possible because the factors
are uncorrelated. The figures in the table are equal to $a_{sj}^2 / s^2(x_j)$. The residual proportions in
the last column are the complements of the ratios in the last column of Table IV.

They will be considered in two groups: the income group, variables 1, 3, 9, 10, 11, 12, 14, 15,
16; and the outlay group, variables 2, 4, 5, 6, 7, 8, 13, 17.


Table X. — Proportion of the Variance of Each Variable Accounted for by Each of the Factors

                     Proportion accounted for by —
Variable         F1          F2          F3          Residual

 1             0.9666      0.0229      0.0089        0.0016
 2             0.9068      0.0009      0.0851        0.0072
 3             0.5018      0.4918      0.0006        0.0058
 4             0.9462      0.0087      0.0296        0.0155
 5             0.8267      0.1215      0.0431        0.0087
 6             0.7311      0.0051      0.2581        0.0057
 7             0.4992      0.0019      0.3367        0.1622
 8             0.4230      0.1010      0.1093        0.3667
 9             0.0889      0.7209      0.0252        0.1650
10             0.5467      0.0332      0.3961        0.0240
11             0.8430      0.0558      0.0849        0.0163
12             0.7194      0.1382      0.0590        0.0834
13             0.8270      0.1333      0.0066        0.0331
14             0.0000      0.7486      0.0001        0.2513
15             0.2108      0.0001      0.0537        0.7354
16             0.7078      0.0885      0.0074        0.1963
17             0.0447      0.0051      0.3883        0.5619

S              0.8076      0.1059      0.0609        0.0256

* See The Economic Journal, April, 1943, pp. 71-3.





The three factors account for almost the whole of the variance of $x_1$, employees’ compensation.
The movement of this variable is very similar to that of $F_1$, but it will be noticed that it is negatively
associated with the second factor, i.e. as income falls there is an offset to the associated fall in
employees’ compensation. This characteristic is shared by all income payments with the exception
of net rents received by individuals. The opposite tendency can be seen in net saving of enterprises,
$x_3$, even after the sums used to write down inventories, $x_9$, are written back.

Diagram 11

Entrepreneurial Withdrawals

Entrepreneurial withdrawals, $x_{11}$, and dividend payments, $x_{12}$, show a relationship to the factors
somewhat similar to that of employees’ compensation. The correlations with $F_1$ are lower and those
with $F_2$ are higher. The former shows a positive association with $F_3$ (a downward tendency), while
the latter shows a negative association with this factor.

Interest payments, $x_{14}$, show what is almost certainly an untypical behaviour, being associated
(negatively) only with the second factor. This is probably to be attributed to changing methods
of company financing, and to conversions during the period of low interest rates following the
depression of the early 1930’s.

Diagram 12

Dividends

Net rents paid to individuals, $x_{10}$, show a small positive correlation with $F_2$ and a much larger
positive correlation with $F_3$.

Net savings of enterprises adjusted for capital revaluations, $x_3$, is almost wholly explained in
approximately equal proportions by the first two factors. This item reflects the movement of
inventory revaluations, which are mainly associated with the change rather than the level of
total income. Variable $x_9$ is an adjustment item needed to write back into total income the
inventory losses which automatically appear as a deduction in $x_3$.

Diagram 14

Interest

The remaining two income items are relatively unimportant. Variable $x_{16}$ is another adjustment
item needed to put the profit figures on the same basis in respect of depreciation and depletion
as is adopted on the outlay side of the account. Variable $x_{15}$ is a combination of small
items not important enough to be treated separately. Its negative association with $F_1$ is largely
due to the fact that it contains direct taxes paid by individuals as a negative component. The
inability of the three factors to explain more than some 26 per cent. of its variance arises mainly
from the fact that, though a small item, it contains the veterans’ bonus, which appeared as a
substantial item in 1936.

Diagram 15

Numerous Small Items Combined


Diagram 16

Adjustment for Depreciation and Depletion



In the outlay group, consumers’ expenditures on goods, $x_2$ and $x_4$, move closely with $F_1$.
Consumers’ expenditure on services moves somewhat less closely with $F_1$ and is negatively associated
with $F_2$.

Gross fixed capital formation is represented by two items, producers’ durable goods, which
are included in $x_2$ owing to the high correlation ($r = 0.954$) between these goods and consumers’
perishable goods over the period, and construction, $x_6$. The latter is fairly closely associated with
$F_1$ and also with $F_3$, i.e. it shows a downward tendency over the period. Depreciation and
depletion, $x_{13}$, an adjustment item needed to convert gross into net fixed capital formation, shows
a moderately high correlation with $F_1$ and a negative association with $F_2$. The change in inventories,
the final component of domestic net capital formation, is associated most closely with
$F_1$, but is also associated positively with $F_2$ and negatively with $F_3$, i.e. shows an upward trend.

Net public outlay, $x_8$, shows, as might be expected, an untypical behaviour. It is negatively
associated with all three factors.

Diagram 17

Foreign Balance including Foreign Tourist Expenditure in the United States




Diagram 18

Discrepancy : Outlay less Income


The final component of outlay, the foreign balance including foreign tourists’ expenditures in
the United States, $x_{17}$, is a small variable only very partially explained by the three factors. Much
the most important contribution is made by $F_3$, $x_{17}$ showing a marked downward trend.


5. Conclusions and Suggestions 

This section will begin with a summary of the conclusions which seem to emerge from this 
investigation, and will end with some brief remarks on the possible uses of factor analysis in 
econometric work. 

(1) Starting with seventeen variables representing the components of total income and outlay
in the United States, it has been found that the greater part, 97.5 per cent., of their combined
variances can be explained in terms of three factors. Furthermore in most cases the individual
variances are largely explained in terms of the three factors.

(2) It is possible by comparison with series estimated from economic considerations and not
from the factor analysis to identify approximately the first factor with total income, and to show
that the second factor is closely related to the rate of change of total income. It must be
remembered that the individual factors are mere mathematical abstractions chosen for certain
convenient algebraic properties, one of which in the present instance is mutual orthogonality. A
more meaningful approach to the matter therefore is to ascertain if certain economically meaningful
variables, such as $i$ and $\Delta i$ in this example, believed to be useful for explanatory purposes, belong
approximately to the same reduced factorial system as the variables analysed, the $x$'s in this
example, rather than to attempt to identify individual economic explanatory variables with individual
factors. It so happens in this case that $F_1$ and $i$ are highly correlated, but this is not the
important point. What really matters is that $i$, $\Delta i$ and $t$ belong approximately to the same
three-factor system as the $x$'s.

So much for specific conclusions. We shall now proceed to more general topics.

(3) It seems clear that factorial methods are potentially valuable in the analysis of economic
data, particularly in dealing with problems of classification, or the formation of ideal types of
variation by which sets of variables such as transactions or prices may be characterized. The
method of principal components seems well adapted to economic investigations, and leads to
measures which have many desirable algebraic properties. From this point of view the purpose
of factor analysis lies in its usefulness in the reduction of data, enabling us to replace a large number
of series by a small number which provide the principal components of variation of the original
data. Subject to what is said in the next paragraph, we might restrict ourselves in the first place
to an attempt to explain the variation of the first few principal factors of a set of variables rather
than that of the much more numerous variables themselves.

(4) Since the purely mathematical character of the factors has been repeatedly stressed in this 
paper, it may seem foolish to suggest that economic theories should be directed to the explanation 
of the variation of such abstractions. This objection would, however, be false in so far as it 
was possible by rotation to reapportion the variation among a set of different (in general oblique) 
factors which could be identified with “real” economic variables. This is the same point as was 
made in paragraph (2) above, and shows the way in which economics (or whatever the subject 
may be to which factor methods are applied) comes into the whole proceedings. We may start
by showing that a large number of variables belong approximately to a system with relatively 
few components of variation, say m in number, and we may then go on to show that certain “key” 
economic variables also m in number belong approximately to the same system, so that an explana- 
tion of their variation provides us with an explanation of the variation of all the variables with 
which we started. The selection of the “key” variables is a matter for the economist, though 
in his capacity of statistician he may ascertain that the first set that occurs to him will not do 
because it does not fit into a reduced system of the required number of dimensions. 

(5) In the light of (3) and (4) it may be helpful to indicate some actual cases in which factor 
analysis would probably be useful in economics. 

First, suppose we are interested in demand analysis. In trying to explain the amount of
any commodity demanded we ought, in theory, to introduce as determining variables not only
the commodity's own price, but also those of all the other commodities bought by the group of
purchasers we are considering. This is clearly impossible, and what is usually done is to include
a price index representing the movement of “all other prices.” In doing this we are allowing
approximately for the first component of variation of the price-complex, but for nothing else.
By applying factor analysis to the complex of price variations we could ascertain a few more
components of variation, which together with the first component would enable us to make a
better allowance for price influences than is obtained by the usual method. It would still be
necessary in certain cases to deal with the price movements of closely competing or completing
goods, but the method would give a better representation of the variation of “other prices.”

Kuznets' and Barger's Data for Income and Outlay in the United States of America

1922-38

Again, if we are trying to analyse the equations of motion of the whole economic system we 
shall normally begin with a large number of variables and equations. With the object of reducing 
the labour involved we may try to reduce this number as much as possible. There is, however, 
another method of approach, namely, to start off with a system of any degree of complexity,
extract the principal components of its variation, and concentrate on the explanation of these 
components or an equivalent number of identifiable components belonging to the same system. 
This, indeed, is a method of arriving at a set of “inner variables” which, if explained, will in turn 
explain all the other variables with which we started. 


Discussion on Mr. Stone's Paper 

Mr. Champernowne : It is a particularly agreeable task for me to propose a vote of thanks
to Mr. Stone, who is a fellow of my own college at Cambridge, and with whom I have on so many
occasions worked and discussed problems in economic statistics; but as my time is limited, and
as I am now representing another university, I shall try to escape from these sentimental associations
during my remarks, and confine my appreciation of the excellencies of his paper to a short
space and try to develop one or two minor criticisms on small points.

The severely technical title of the paper first suggested that no more would be attempted than 
an account of the correlations existing between the various parts of the income and outlay of the 
United States during the period 1922 to 1938; but having applied the technique of factor analysis 
to these series, Mr. Stone had gone on to consider far wider problems concerning the explanation 
of the movements of the various series and other applications of factor analysis to such problems 
as demand analysis and elucidation of the equations of motion of the whole economic system, 
as discussed in the final paragraphs of the paper. 

Constructive optimism is characteristic of the most influential economic statisticians, such
as the late Lord Keynes and Mr. Colin Clark, so that they are impatient of the limitations of
existing materials and methods, and their minds are always working to adapt existing
material and statistical techniques to suggest the answers to the big central problems of economic
theory and administrative practice. So it is with Mr. Stone, who during the war took hold of the
idea of applying the principles of accountancy to national income statistics, and turned it into the
impressive Government White Paper on National Income which the Central Statistical Office
now provides for us every April. Again, starting from the idea of using Professor Frisch's confluence
analysis to examine the demand for commodities, Mr. Stone gave to us nearly two years
ago a comprehensive paper which was described in the discussion as the best in its sphere ever
read before this Society. In that paper he derived estimates of the nature and importance of many
economic influences which affect the quantities consumed of a large range of important commodities,
and he derived all these estimates from a quantity of basic data so small that most academic
statisticians would have lacked the courage to publish their conclusions in full detail and offer
them for the criticism of this so learned Society. Mr. Stone’s courage has been amply justified
by the fact that his results have not after all these months been refuted.

The subject which Mr. Stone has attacked this evening is that of explaining the movements of
various series of figures constituting the income and outlay of the United States during the years
from 1922 to 1938. This problem is one which would exhaust the patience of almost any statistician.
For when one studies the economic time series of this period for either the United States
or this country one soon discovers that all the series behave in a very similar fashion; nearly all
of them seem to respond to the boom of 1929 and the slump of 1933. Some of them lag behind
and others lead. Some of them show a trend increase and others a trend decrease; but apart
from these divergencies, most of them seem to behave alike. The academic statistician may well
be tempted at this stage to throw in his hand in despair, and say that the series are so completely
mixed up with each other that there is no way of telling which is the cause of which. It would
be always possible to account for any one series in terms of a linear trend and the movements
of any two of the other series, provided that the fluctuations of the other series were not exactly
in phase with one another.

Mr. Stone has not given up. He has taken the bull by the horns and said, “These series seem 
to be highly intercorrelated and therefore easier to explain because we can explain all of them in 
terms of two key series and a linear trend, and all that will remain to be done will be to explain 
these two key series.” 

Using the admirable device of a spherical map, Mr. Stone has been able quite adequately to
represent the seventeen series of observations by seventeen points, with circles round them to indicate
the margins of error. The distance between any two points represents the amount of correlation
between their corresponding series, the precise mathematical relation being that the cosine
of the distance between the two points represents, within limits of sampling error, the correlation
coefficient between the two series concerned. The fact that this mapping can be done at all,
within the margin of the sampling errors, is due to the high interdependence of the series, and also
to the high margin of error due to sampling error for the series covered. A spherical map tells us
all we can learn about the true intercorrelation from information so inadequate.

In passing, I think this point about the very large margins of error and the fact that spherical 
mapping only is possible because the data are inadequate may possibly bear on some of the points 
discussed in Mr. Stone’s earlier paper about Demand Analysis. It may be suggested that the 
margins of error there are perhaps rather higher than one would otherwise suppose, but I think 
Dr. Barna is later going to develop that point, so I will leave it on one side.

The question may be put, about this mapping on the sphere, whether, in view of the roughness 
of the information available, even a spherical map is needed. Can we even distinguish three 
factors on the surface of a sphere, or ought we sometimes to be content with two? If we may 
focus our attention on the three influences of (1) trade cycle sensitivity, (2) time lag, or time 
lead, (3) trend increase or trend decrease, it is pertinent to ask whether, having measured the trade 
cycle sensitivity for any one series, we can still measure both its time lag and its trend increase. 
My own belief is that only a minority of the series here given provide sufficient information to 
enable this to be done with significant results. Most of the series will allow a significant measure- 
ment of trade cycle sensitivity and of time lag — or alternatively, of trade cycle sensitivity and trend ; 
only a minority will provide significant measures of all three. I think Mr. Stone himself would
allow that, since the circles drawn round the points on his sphere show the margin of error. If 
this were so he could confine himself to two great circles on this sphere cutting each other at right 
angles. He could even represent it on a blackboard, in that case by drawing two circles on the 
blackboard ; but it also means that even less conclusions could be drawn about what is the cause 
of what. 

This negative conclusion is based on an attempt to explain eight of Mr. Kendall’s random 
auto-regressive series* in terms of Mr. Stone's three factors. These series oscillate in much the 
same way as economic series, but are generated by a random process. On the average, each factor 
explains about one-eighth of the total variance of any auto-regressive series. This equals one- 
fifth of the residual variance left unaccounted for by the three factors. But in some cases a factor 
explains as much as one-third of the residual variance, and this suggests that whenever in Mr. 
Stone’s analysis a factor accounted for an amount of variance less than one-third of the residual 
variance this explanation was not really significant, because it might quite well have happened by 
chance as in the case of the auto-regressive series just considered. 

If we turn to Table X on p. 21 and compare the figures in columns $F_2$ and $F_3$ with the residuals, we find
that on this test five figures in column $F_2$ are not significant and seven figures in $F_3$; so are three
figures in column $F_1$. Only in the case of variable numbers 1, 4, 5, 6, 10, 11 and 12 are all three
coefficients significant.
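
The counts given by the speaker can be retraced from Table X. The sketch below applies his rule of thumb as stated, treating a factor's share as not significant when it is less than one-third of the residual proportion. The names `table_x`, `not_significant` and `all_three` are labels introduced here, and the rule itself is the speaker's rough test, not a formal significance test.

```python
# (F1, F2, F3, residual) proportions of variance for variables 1 to 17,
# transcribed from Table X.
table_x = [
    (0.9666, 0.0229, 0.0089, 0.0016),
    (0.9068, 0.0009, 0.0851, 0.0072),
    (0.5018, 0.4918, 0.0006, 0.0058),
    (0.9462, 0.0087, 0.0296, 0.0155),
    (0.8267, 0.1215, 0.0431, 0.0087),
    (0.7311, 0.0051, 0.2581, 0.0057),
    (0.4992, 0.0019, 0.3367, 0.1622),
    (0.4230, 0.1010, 0.1093, 0.3667),
    (0.0889, 0.7209, 0.0252, 0.1650),
    (0.5467, 0.0332, 0.3961, 0.0240),
    (0.8430, 0.0558, 0.0849, 0.0163),
    (0.7194, 0.1382, 0.0590, 0.0834),
    (0.8270, 0.1333, 0.0066, 0.0331),
    (0.0000, 0.7486, 0.0001, 0.2513),
    (0.2108, 0.0001, 0.0537, 0.7354),
    (0.7078, 0.0885, 0.0074, 0.1963),
    (0.0447, 0.0051, 0.3883, 0.5619),
]


def is_significant(share, residual):
    """Rule of thumb from the discussion: share at least one-third of residual."""
    return share >= residual / 3.0


# Number of non-significant entries in each of the columns F1, F2, F3.
not_significant = [
    sum(1 for row in table_x if not is_significant(row[k], row[3]))
    for k in range(3)
]

# Variables for which all three factor shares pass the rule.
all_three = [
    number for number, row in enumerate(table_x, start=1)
    if all(is_significant(row[k], row[3]) for k in range(3))
]

print(not_significant)
print(all_three)
```

Applied to the table as printed, the rule gives three, five and seven non-significant entries for the first, second and third factors, and leaves variables 1, 4, 5, 6, 10, 11 and 12 with all three shares passing, in agreement with the speaker's figures.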

My time is practically up, and I do not wish further to labour the question of the unreliability
of information extending over so short a period. Mr. Stone has laid considerable stress on the
fact that each of his three factors has a high total correlation with the three variables, income, rate
of change of income, and time. Mr. Stone had himself foreseen that this was likely, and it is
no criticism of his paper to suggest that the explanation is fairly simple. I think I am right in
saying that the total of his seventeen series almost exactly represents income ; it is in fact almost
exactly equal to income, and owing to the method by which Mr. Stone has chosen his series and
to the fact that most of the factors are correlated with each other, I think it is inevitable that

* Kendall, M. G., Contributions to the Study of Oscillatory Time Series, 

Discussion on Mr. Stone's Paper 

the first factor should be highly correlated with the total income.* The fact that the main diffe- 
rence between the other series is one of time lag automatically ensures that the other factors must 
be highly correlated with the rate of change of income, because if you add a small multiple of the 
rate of change of a factor on to the factor itself, it has roughly the effect of just advancing the 
series in time, making everything happen a little earlier. 
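
The remark about advancing the series in time can be checked by a first-order Taylor expansion. The notation below ($x(t)$ for a smoothly varying series, $h$ for the small multiple) is introduced here for illustration and does not appear in the paper.

```latex
\[
  x(t + h) = x(t) + h\,\frac{dx}{dt}(t) + O(h^{2}),
  \qquad\text{so}\qquad
  x(t) + h\,\frac{dx}{dt}(t) \approx x(t + h) \quad \text{for small } h .
\]
```

On this reading, a series that runs a little ahead of income behaves approximately like income plus a small multiple of its rate of change, which is why series differing mainly by a time lag would be expected to load on a rate-of-change factor.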

I think that explains why the second factor is so highly correlated with the rate of change of 
income, or, at any rate, with some linear function of income itself and the rate of change of income. 
(The second alternative is in this particular example avoided, because the correlation between 
income and the rate of change of income happens to be very low.) The correlation with time 
itself is admittedly not quite obvious, and it is not so easy to show why that correlation should 
be expected; but it seems to be natural that you would have time represented in the third dimen- 
sional factor because income itself has no very steady trend in time, nor has the rate of change 
in income, yet some of the series represented change quite a lot over the whole of the period. In 
saying this I am not differing from Mr. Stone, who did himself say that it was a thing to be expected. 

There are one or two general remarks to be made about applying factor analysis to economic 
time series. It is worth pointing out that there are certain dangers in using factor analysis based 
only on correlation coefficients for purposes of classification. For if we rely on the coefficients 
we shall classify together any two highly related series, even though they are unalike. Thus motor 
vehicle output may have a rapid trend increase and be sensitive to booms and slumps. Bread 
production may have a gentle trend increase and be fairly insensitive to booms and slumps. In 
this case the two series may well be highly correlated, but to classify them together would be 
misleading. The point is illustrated in the accompanying chart, which shows graphs of two 
such series which behave quite differently, although there is perfect correlation between them. 





One would not want to classify these as identical. From the economic point of view they may 
represent quite different kinds of influence, but they are very highly correlated, as you can see 
from the diagram. They would automatically be classified together on the factor analysis diagram. 
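
The point can be illustrated numerically. The sketch below uses invented figures (the names `pattern`, `motors` and `bread` are hypothetical, not data from the paper): both series are linear functions of one common pattern, so their correlation is perfect even though one swings twenty times as widely as the other.

```python
from math import sqrt


def pearson(u, v):
    """Product-moment correlation coefficient of two equal-length series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Hypothetical common pattern: a rising trend with a boom and a slump on top.
pattern = [0.0, 1.5, 3.5, 2.0, 0.5, 2.5, 5.0, 6.5]

# Invented figures: "motors" responds strongly to the pattern, "bread" faintly.
motors = [100 + 20 * p for p in pattern]
bread = [100 + 1 * p for p in pattern]

r = pearson(motors, bread)                  # perfect correlation (up to rounding)
motors_range = max(motors) - min(motors)    # total swing of the sensitive series
bread_range = max(bread) - min(bread)       # total swing of the insensitive series
```

A classification resting on `r` alone would put the two series together; comparing `motors_range` with `bread_range` shows what the coefficient conceals.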

One other point upon this is that despite the difficulties I have stressed in applying this factor 

* I agree, however, with Dr. Geary's point in the discussion, that the observed value of 0.995 is higher 
than one would expect : 0.97 is the value I should expect on the average, and the high value of 0.995 is a 
bit of a fluke, but not a very violent one. 





analysis technique to economic time series, there is one small compensating advantage, namely, 
that it is possible to take advantage of the fact that the number of years is fairly small, and there 
is this continuous variation from year to year, in order to effect certain computing economies. I 
think it would be possible to compute the three main factors, or something very like them, without 
having to work out the correlation coefficient between each pair of series. Taking the approxi- 
mation of the first factor, by applying weights plus 1, 0 or minus 1 to the various series, choosing 
these weights in the first place by inspection, one would then get the second approximation for 
the first factor by applying as weights to the series their various correlation or covariance with the 
first approximation to the first factor. One would then proceed in like manner to find approxi- 
mations to the other two factors. But I will not spend time developing that point now : obviously 
the amount of work to be done would only increase in direct proportion to the number of series, 
whereas in the ordinary way the amount of work increases with the square of the number of 
series to be examined; so if the number of series is large, considerable computing economies may 
be effected. 
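
One possible reading of this short-cut is sketched below. It is an interpretation, not the speaker's worked procedure: the function name `first_factor_weights`, the rescaling of the weights to unit length, and the fixed number of passes are all assumptions made for the illustration. Each pass forms a weighted composite of the series and then takes each series' covariance with that composite as its new weight, so the full table of pairwise correlations is never computed and the work per pass grows in direct proportion to the number of series.

```python
from math import sqrt


def first_factor_weights(series, start, n_iter=20):
    """Approximate first-factor weights without forming the covariance matrix.

    series: list of equal-length lists, each measured from its own mean.
    start:  initial weights (+1, 0 or -1) chosen by inspection.
    Assumes the composite never becomes uncorrelated with every series.
    """
    n_years = len(series[0])
    w = list(start)
    for _ in range(n_iter):
        # Current approximation to the factor: a weighted sum of the series.
        composite = [sum(w[i] * s[t] for i, s in enumerate(series))
                     for t in range(n_years)]
        # New weights: covariance of each series with the composite.
        w = [sum(s[t] * composite[t] for t in range(n_years)) / n_years
             for s in series]
        # Rescale to unit length so the weights stay comparable between passes.
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


# Invented example: three series that are exact multiples of one movement g.
g = [-2.0, -1.0, 0.0, 1.0, 2.0]
loads = [3.0, 1.0, -2.0]
series = [[c * v for v in g] for c in loads]
weights = first_factor_weights(series, start=[1, 1, -1])
```

In this artificial case the weights settle at once in proportion to 3, 1 and -2. The later factors would be found in like manner from the series after removing what the first composite explains.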

In conclusion I would like to ask Mr. Stone to develop the remarks made in his final paragraph, 
where he suggests that factor analysis may be useful when we are trying to analyse the equations 
of motion of the whole economic system. Although from my own pessimistic point of view I 
find it difficult to see how factor analysis can be used to explain anything in economics at all 
except in the sense of describing the correlation coefficients observed, the idea of using factor 
analysis for showing inner variables sounds very fascinating. 

I have great pleasure in proposing a vote of thanks to Mr. Stone for his lucid and entertaining 
paper. 

Mr. Babington Smith: I have much pleasure in seconding the vote of thanks to Mr. Stone 
for his interesting and stimulating paper. I think there are good reasons why factor analysis 
should be tried outside psychology, the field in which it was born and grew up, and I am very 
glad this is being done. In the psychological field there has been a tendency to sectarianism, 
which I think is quite largely due to the present state of ignorance about the subject-matter and the 
consequent disagreement as to the interpretation of the abstract factors. The procedure of factor 
analysis is wholly mathematical, and in that sense should be applicable to any correlated sets of 
measures. 

The question of the relative merits of the various mathematical processes has, I think, been 
fairly well thrashed out, and the main problem now is one of interpretation. While the use of 
factor analysis in fresh fields is all to the good, I should myself prefer to see the methods applied 
to cases where an answer is known rather than in economics, another field where, I believe, 
interpretations may still not find unanimous acceptance. 

I am much interested in his argument at p. 8 in favour of the use of squares and products as 
against correlations. I think he correctly represents the trend of development in psychology to 
design tests a priori, but I am inclined to think the method is most useful in mapping an unknown 
field, and I would prefer to carry out the inverse process of assessing the relative importance of 
a few factors by another method— perhaps analysis of variance. 

In the paper itself there are several points I would like to raise with Mr. Stone. I do not know 
the answers and I am seeking information. The first point is: By all the standards developed 
in psychological work, the size of the group, that is to say, the number of years considered, is 
very small; and it is also small when you consider the number of variables that have been used. 
This means that the correlation coefficients have standard errors of considerable size, and surely 
there are discouragingly few degrees of freedom in determining the new axes. I think there are 
methods for assessing the significance of factors and factor loadings ; and certainly there are methods 
for seeing whether more than one factor is necessary. In view of this, perhaps the most surprising 
thing about the paper is the very close fit which is achieved by the regression equations, and I 
cannot help feeling— I think it was the point made by the last speaker, Mr. Champernowne— 
that much of this is due to the use of unstandardized squares and products and the preponderating 
effect of income itself. 

The second point is that I notice the period under review covers years in which I have heard 
there was a great boom and a great slump, and I would like to know what effect this has had on 
the resulting factor analysis. If there were conceivable a halcyon period from which boom and 
slump were absent, would the same factor pattern appear ? 

The third point: What is the effect of constraints which would be incurred from the fact that 
certain constituents add up to income and others add up to outlay ? What is the result in loss of 
degrees of freedom? Were the estimates of the constituents independent? Were they, for 
instance, independent, for successive years ? 





The fourth point is that I am very much puzzled to see how ΔI, the change in income from 
year to year, can come out as a factor from a factor analysis of a set of variables which together 
represent income and outlay. Nothing in the correlation table, or in factor analysis as a method, 
relates to the order of the members in the sample. The nearest I can get to an explanation is that, 
if Mr. Stone had correlated years (which we may call Q analysis), not parts of income and out- 
lay (which we may call R analysis), there might have been greater correlations between neigh- 
bouring years, and that this phenomenon would have made itself felt in the analysis as a factor 
representing neighbourliness or the resemblance between successive years. Then, on the principle 
that the situation underlying the Q or R analysis is the same, if a factor of resemblance between 
successive years comes out in the Q analysis it must have an analogue in the R analysis. 

I am not quite satisfied by this explanation, and should be glad to have Mr. Stone’s views on 
this and my other points. 


The vote of thanks was then put to the meeting and carried unanimously. 


Sir Cyril Burt said he had listened to Mr. Stone’s paper with a twofold interest, first because 
he had always been attracted by the possibility of applying the methods of factor-analysis to 
economic problems, and secondly because, when Mr. Stone and Miss Potter had first consulted 
him about factorial methods, he had himself ventured to try his hand at an elementary factorization 
of the figures they had sent. 

As everyone knew, different psychologists had proposed many rival procedures to discover 
and estimate what they rather inappropriately had called “factors,” and economists might feel a 
little doubtful as to which of these methods, if any, they were to adopt. Elsewhere he had sought 
to show that all of them could be reconciled, if they were regarded as convenient working pro- 
cedures for approximating to what he himself considered to be the best and fundamental method 
of all, that of least squares as applied to the matrix of variances and covariances (or, more 
accurately, of square sums and product sums). It was, therefore, gratifying to find that, on quite 
independent grounds, Mr. Stone had eventually decided that this procedure was on the whole 
most fitted for his purpose. It might be noted that the use of principal axes as factors was 
originally suggested by Karl Pearson (Biometrika, I, 1901, p. 209), and was thus older than any 
other method of factorization. 

Curiously enough, this was a procedure hardly ever adopted by psychologists; even Hotelling, 
in the reference Mr. Stone had given, had applied the technique to correlations, not to covariances ; 
and Professor Spearman had always opposed the use of covariances. Sir Cyril Burt thought 
the real reason had been that the calculations required were so lengthy and elaborate. Since that 
might possibly deter economists who would like to experiment along factorial lines, he might 
perhaps mention that, at any rate for preliminary studies, there were much simpler methods 
available. Miss Potter, he thought, originally had tried using one or two of the short-cuts that 
he had used in factorizing physical measurements; and he himself, with the help of one of his 
assistants, had tried the effect of applying what was called “simple summation” (which Thurstone 
had re-named the “centroid” method) to the correlations derivable from Mr. Stone's covariances. 
So far as Sir Cyril could see, the factors that emerged were almost exactly the same, though the 
detailed figures were slightly different. 
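
For readers unfamiliar with the short-cut, the first step of simple summation as it is usually described can be sketched as follows. The function name and the small correlation table are invented for the illustration, and the sketch assumes unities in the diagonal and all-positive correlations (so that no reflection of signs is needed); none of these details is given in the remarks above.

```python
from math import sqrt


def centroid_first_loadings(r):
    """First-factor loadings by simple summation (the "centroid" method).

    r: square correlation matrix as a list of rows, diagonal included.
    Each loading is the variable's column sum divided by the square root
    of the grand total of the table.
    """
    size = len(r)
    col_sums = [sum(row[j] for row in r) for j in range(size)]
    grand_total = sum(col_sums)
    return [s / sqrt(grand_total) for s in col_sums]


# Invented correlation table for three variables (unities in the diagonal).
r = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings = centroid_first_loadings(r)
```

Only additions and one square root are needed, which is what made the method attractive before mechanical aids were common. The loadings sum to the square root of the grand total, a convenient check on the arithmetic.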

The main difference was due to the fact that, as Mr. Stone had pointed out, the observed 
variables differed so widely in their variance, as shown in his Table 1. Sir Cyril Burt was inclined 
to suggest that, in applying the technique to a relatively new field, both methods ought to be tried. 
It always seemed possible that a variable which had a very small variance might, nevertheless, be 
one of the variables which furnished the best clue or guide to the variations of all the others. In 
such a case, factor-loadings based on covariances might lead to an under-estimation of its influence. 

Whatever method were adopted, however, there could be little doubt that all the variances 
exhibited by the 17 variables in the initial table could be explained, with a very trifling remainder, 
in terms of three factors, and three factors only. Though the number of years over which the 
correlations were calculated seemed much smaller than was usual in factor-analysis, nevertheless 
that result seemed a most remarkable demonstration of the value of factorial analysis in economic 
research. 

The nature of the three factors was, he believed, rather what might be expected on a priori 
grounds. In anthropological work the nearest parallel he could think of was the factorial analysis 
of assessments for physical growth and educational development, based on tests and re-tests over 
a period of years. There, too, much the same three factors emerged, though not always in the 
same order. The first factor was usually an average of all the assessments ; it might be called the 





child’s general level. The second factor represented rate of change or rate of growth. The third 
factor, of course, was time, which, in the case of developing children, meant chronological 
age. 

Mr. Stone’s spherical map was itself highly suggestive. For explanatory purposes Sir Cyril 
and his students had always found it illuminating to plot the three main factors on a blackened 
globe, like those used by the geographer. But there was a further device which might be usefully 
employed, both for illustration and even for approximate calculation. As geographers had long 
ago pointed out, the positions of points upon a globe could, by certain devices, be represented on 
a plane surface. With what was called gnomonic projection, a map could be obtained on ordinary 
squared paper. Better still, for many purposes, was the diagram obtained if one borrowed the 
stereographic projection-nets which were used for similar purposes in crystallography. 
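
The two projections mentioned can be sketched in a few lines. The conventions chosen here are assumptions made for the illustration: each variable is a point on a unit sphere given by its three factor loadings scaled to unit length, the gnomonic map projects from the centre on to the plane touching the sphere at the "north pole" (0, 0, 1), and the stereographic map projects from the "south pole" on to the equatorial plane. The function names are hypothetical.

```python
from math import sqrt


def to_unit_sphere(a, b, c):
    """Scale three factor loadings so that the point lies on the unit sphere."""
    length = sqrt(a * a + b * b + c * c)
    return (a / length, b / length, c / length)


def gnomonic(point):
    """Project from the centre on to the tangent plane at (0, 0, 1).

    Only points in the upper hemisphere (z > 0) can be shown.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("gnomonic projection needs z > 0")
    return (x / z, y / z)


def stereographic(point):
    """Project from the south pole (0, 0, -1) on to the equatorial plane."""
    x, y, z = point
    if z <= -1.0:
        raise ValueError("the south pole itself has no image")
    return (x / (1.0 + z), y / (1.0 + z))


# Invented loadings 3, 0, 4, which scale to the point (0.6, 0.0, 0.8).
point = to_unit_sphere(3.0, 0.0, 4.0)
flat_gnomonic = gnomonic(point)
flat_stereographic = stereographic(point)
```

The gnomonic map turns great circles into straight lines, which suits squared paper, while the stereographic map preserves angles and can show almost the whole globe on one sheet.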

Turning to the more general implications of the paper, Sir Cyril Burt said that all psychologists 
would listen with interest to the criticisms of factorial techniques that might be advanced by 
professional statisticians and economists. Many psychologists were, of course, still a little sceptical 
about the validity of those techniques. The earliest factorists to discuss or criticize those methods 
regarded them as in some way peculiar to psychology as such. They had supposed that we had 
here a new instrument peculiarly adapted for discovering the fundamental faculties of the mind. 
He himself, however, believed that, reduced to its simplest terms, factor-analysis was merely a 
convenient device for averaging. The first factor to be extracted was nothing else but an average 
(at its best an appropriately weighted average) of all the variables in the table. The second 
factor was a weighted average of the deviations about the first factor; and so on. That being so, 
the statistical technique ought to be applicable to almost any set of complex data where correlations 
with such averages were desirable — to physical measurements, medical data, anthropological 
data, sociological data, and the like. Indeed, the devices developed independently by the 
quantum physicists for what they called “spectral analysis” turned on precisely the same 
principles. 

If that were true, then it seemed to him that the technique stood or fell on its own mathematical 
merits, quite regardless of its value in the narrower field of psychology. With that interpretation 
in mind, and with the help of his colleague, Dr. Rosenstein Rodan, some of his students had 
often tried their hand at factorizing economic data, and were eventually led to the conclusion that 
the method should be useful for numerous problems where the primary object was to find the best 
weighted average or to investigate independent factors — for example, in determining various 
economic indices, the cost of living index, and so forth. Indeed, one of his students, Mr. Hammond, 
had made an attempt at his suggestion to apply factorial analysis to variations in prices, etc., 
in different countries during a period which had been much the same as that covered by Mr. Stone. 
Mr. Hammond had briefly reported some of his efforts in the last number of the British Journal of 
Educational Psychology. 

In the field of economics, of course, added Professor Burt, psychologists were amateurs, not 
to say trespassers. But he thought all who were concerned with the scientific analysis of complex 
data, such as form the foundations of all the human sciences, would welcome the intensive and 
exceptionally thorough investigation that Mr. Stone had carried out. 


Dr. Barna said he was very glad to read Mr. Stone's paper; he had had only forty-eight 
hours or so in which to think about it, but he was sure that economic statisticians would benefit greatly 
by reading the paper. He had managed to clear up some of his doubts, which had troubled him 
for a long time, by reading the conclusions of Mr. Stone's paper. 

He desired to call attention, not so much to the technical details of the paper, as to its economic 
implications, and to the purpose of the method of factorial analysis as against the purpose of other 
methods, such as the method used in Mr. Stone’s paper about two years ago. 

Mr. Stone had managed to explain short-period variation in a great number of factors in terms 
of three major factors, and he was very glad to see that a very simple and satisfactory economic 
interpretation could be found for that, and he was also glad to see that there was nothing 
surprising about this economic explanation. 

Of course, the fact that fluctuations in total income were the chief explanation of short-period 
fluctuations of components of the national income was the basis of Lord Keynes’s theory, and the 
fact that the rate of change of income was another factor to be introduced in the Keynesian 
system had been pointed out by Professor Haberler in his book, Prosperity and Depression. 

But it was one matter to deal with the interpretation of explanations of short-periods of fluctua- 
tion, and quite another matter to make use of those explanations in forecasting. It was, of course, 
forecasting which was the primary purpose of making those explanations, quite apart from 
curiosity, and he was afraid he was very pessimistic on the subject of forecasting, and he desired 





to point out what could be called the economic consequences of Mr. Stone’s present paper, 
especially on the results of his earlier paper. 

When reading Mr. Stone's earlier paper the speaker had not liked the results, and as an econo- 
mist he had not been prepared to accept the actual figures contained in the results. He thought 
the price-elasticities were too high and the income-elasticities, on the whole, too low; and he also 
thought that the error in the two elasticities must have been correlated; thus, if one were too low, 
the other must be too high. 

The present paper gave a clue to the previous problem and he was sorry to say it confirmed his 
fears, namely, that there was a very high correlation between the 17 series (and although the present 
data were not identical with those used previously, he thought very much the same result would 
be reached), and hence, as Mr. Champernowne had pointed out earlier, the margin of error in the 
income-elasticities and price-elasticities must be very high, probably so high that the method of 
using multiple correlation analysis of short-period time series in order to discover price and income 
elasticities would have to be given up altogether. 

The fact that various economic series were highly correlated was a useful fact in certain 
contexts; for instance, Dr. Rhodes’s index number of business activity could be very economically 
constructed because of that, or Mr. Stone’s conclusions could be used to extrapolate some of the 
time series. He preferred to use the expression “extrapolate” rather than forecast. The funda- 
mental fact was that all our series were derived from the movements of a free economic system 
in the United States in a short period, when everything moved up and down together, and that 
would not tell us anything, for example, about the demand for a single and important commodity 
the price of which was to be raised substantially by a duty ; or what happened to various factors 
when the government kept the rate of interest stable and did not let it fluctuate in a way in which 
it had fluctuated in the past. In so far as there was no independent variation in the system 
in the past, the forecasting based on historical data would break down. 

He was, however, more optimistic than Mr. Stone as regards the suggestions for further work 
in this type of research. Mr. Stone’s main attitude had been that all the difficulties were due to 
the fact that our samples were small, and he had wanted to enlarge the sample by having time 
series for longer periods than we had had in the past. 

Dr. Barna was not sure that that was the right way to enlarge the sample. A sample could 
be enlarged in more than one dimension, and he was not sure that this was the right dimension. 
The purpose of enlarging the sample was to catch the independent variation of factors in the 
economic system. In the short period we had no independent variation. We had a stability in 
some institutional factors and also high correlation between the various economic variables ; by 
lengthening the series we might get some independent variation; but on the other hand, we might 
destroy the assumption of institutional stability; for instance, the introduction of a tariff, in dealing 
with imports, did very much more than introduce independent variation; it destroyed the whole 
institutional stability of the economic system and cut the time series into two. He was afraid 
that lengthening the time series too much might do more harm than good. He would like to hear 
Mr. Stone's opinion on that point. As an alternative, Dr. Barna suggested enlarging the sample 
in other directions. 

He had two main suggestions to make : one was to include an analysis of family budgets and 
try to collate the results obtained by this method with the results obtained from time series. 
Actually, he considered the best example for that was a paper by Mr. Stone, in the Review of 
Economic Studies in 1938, where the latter had tried to estimate the Keynesian multiplier by three 
independent methods. 

The other suggestion Dr. Barna desired to make was again tying up with what Mr. Champer- 
nowne had said earlier, that the forming of composite time series from single series should be 
placed on a more scientific basis than had been done in the past. The fact that two series cor- 
related did not seem to be a sufficient reason for aggregating them ; a high degree of economic 
substitution between two commodities might be a more valid reason for lumping them together. 
Perhaps Mr. Stone would consider the methods used by the late Erwin Rothbarth when he 
dealt with a similar problem in connection with index numbers in the Review of Economic 
Studies. 


Mr. C. R. Rao said that Mr. Stone in his paper started with the expectation, before the analysis 
was undertaken, that most of the variation would be capable of being accounted for by the three 
components : 

(a) Total income or some similar quantity, 

(b) Rate of change of (a), and 

(c) A trend term representing the underlying tendency of the economy to expand or contract, or of its elements to do so relative to one another, 




and came to the conclusion that his expectation had been confirmed by the factor analysis, and 
that most of the variables $x_j$ can be explained fairly accurately by an expression of the form 

$$x_j = a_j I + b_j\,\Delta I + c_j t.$$ 

The method was to identify this system of economic variables with three principal factors which 
account for 97.4 per cent. of the total variance. 

This achievement was more the triumph of an economist who could discover the economic 
forces which had brought about the observed pattern in the data than that of the factor analysis 
used as a method to establish the theory. Having felt that a system of economic variables would 
determine the pattern of another set of variables, one could think of directly establishing a connec- 
tion between them without going through the stage of determining the factor pattern of the latter 
set and searching for agreement with the former. This led to the following logical problems 
concerning two sets of variates $x_1, \ldots, x_s$ represented by x and $y_1, y_2, \ldots, y_r$ $(s > r)$ 
represented by y: 

Problem (a): Whether y forms an exhaustive set for explaining the variations in x. We 
might conventionally and conveniently define a set such as y as exhaustive for x if at least 95 per cent. 
of the variation of x is explained by y. 

Problem (b): Whether the set y forms a minimal exhaustive set in the sense that it does not 
remain an exhaustive set if one or more variables in it are not considered. 

Problem (c) : Whether there are any factors in the set x which are entirely independent of y. 

In answering problem (a) we had only to fit regression functions of individual x's on the y's 
and examine how much variation was explained in each case. A table giving the ratios of variation 
due to y's to the total variation, similar to Table IV of Mr. Stone's paper, where he gave the sum 
of squares explained by the factors for each of the x's, supplied all the information relevant for 
the purpose. If y did not come out to be an exhaustive set, we must enquire how many factors 
were necessary besides the set y to get an exhaustive set and how one could extract them in a 
hierarchical order. This could easily be done by finding the principal components of the residual 
dispersion matrix. If the residual variables were represented by $x'_1, \ldots, x'_s$ and the set by x' 
then we had to extract the principal factors, using x' in the manner adopted for x in Mr. Stone's 
paper. Further observable variables might have to be searched for in addition to y to explain 
important factors, if any, arising out of the residuals. 
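
The check proposed for Problem (a) might be sketched as follows. This is an illustration under stated assumptions, not Mr. Rao's own computation: the function names are hypothetical, the data are invented, the 95 per cent. rule is applied to each x separately (the remarks leave open whether it refers to each variable or to the total), and the y's are assumed not to be linearly dependent on one another.

```python
def _centre(v):
    """Measure a series from its own mean."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            f = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def explained_ratio(x, ys):
    """Share of the variation of x explained by its regression on the y's."""
    xc = _centre(x)
    yc = [_centre(y) for y in ys]
    yty = [[_dot(p, q) for q in yc] for p in yc]   # sums of products of the y's
    ytx = [_dot(p, xc) for p in yc]                # sums of products with x
    coef = _solve(yty, ytx)                        # regression coefficients
    return _dot(coef, ytx) / _dot(xc, xc)


def is_exhaustive(xs, ys, threshold=0.95):
    """True if every x has at least `threshold` of its variation explained."""
    return all(explained_ratio(x, ys) >= threshold for x in xs)


# Invented data: x_a is an exact function of the y's, x_b is unrelated to them.
y1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
y2 = [1.0, -1.0, 0.0, -1.0, 1.0]
x_a = [2 * a - b for a, b in zip(y1, y2)]
x_b = [1.0, -2.0, 0.0, 2.0, -1.0]

ratio_a = explained_ratio(x_a, [y1, y2])
ratio_b = explained_ratio(x_b, [y1, y2])
```

Here `ratio_a` comes out at unity and `ratio_b` at zero, so the set y would be exhaustive for `x_a` alone but not for the pair. Problem (b) could be examined with the same function by dropping one y at a time and recomputing the ratios.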

This method was of particular importance in dealing with time series, where one could always 
determine how the time factor was affecting the individual series. It might not be linearly con- 
nected with the variables. In fact, if one were dealing with a long series, the regression of a variable 
on time would be in the nature of a polynomial trend superimposed on which are periodic fluctua- 
tions capable of being represented by Fourier components. The complex pattern introduced 
by the time factor might be removed and the factors affecting the residuals might then be studied. 
If this were not done, and the usual method of factoring were followed, one might get a large 
number of important factors, many of which might have to be identified with the complexities, 
which were difficult to determine at this stage, introduced by a single factor, as, for instance, the 
time factor. 

Problem (b) was important from the point of view of an economist who was seeking for a set 
of directly observable variables to explain the variations in another set so that even if the whole 
data could be represented by a fewer number of abstract factors, a larger number of directly 
observable variables might be necessary to interpret the abstract factors. One had only to satisfy 
that the set y did not remain an exhaustive set if a variable from this was not considered. This 
could be done by examining the ratios of variances explained by the whole set y and a subset 
of y. It might be noted that the factor pattern of x obtained by the method of principal com- 
ponents can give no indication of the nature or number of observable variables in the set y. 
When once the y's are determined from other considerations, factor analysis for merely establishing a 
connection does not seem to be necessary. 

To answer Problem (c) one might proceed as follows: Let $a_{ij}$ be the covariance between 
$x_i$ and $x_j$, and $b_{ij}$ that between $x_i$ and $y_j$. We want to find the most important factor expressible as 
a linear function of the x's, which cannot be predicted from the observed set of y's with any degree 
of confidence. 

If $\sum_i l_i x_i$, such that $\sum_i l_i^2 = 1$, is a factor, then we have to maximize 

$$\lambda = \sum_i \sum_j a_{ij}\, l_i l_j$$ 

subject to the conditions 

$$\sum_i l_i^2 = 1, \qquad \sum_i l_i b_{ij} = 0 \quad \text{for } j = 1, 2, \ldots, r.$$ 

The maximum value of $\lambda$ is a root of the determinantal equation 

$$\begin{vmatrix} a_{ij} - \lambda\,\delta_{ij} & b_{ij} \\ b_{ji} & 0 \end{vmatrix} = 0.$$ 

This gives rise to an equation of degree $s - r$, and the factor corresponding to any root can be 
determined in the usual way. Only those factors whose variances are appreciable need be con- 
sidered. Time here did not permit a lengthy discussion of these problems, and Mr. Rao hoped 
to treat them elaborately in a subsequent communication, where the intrinsic connections of these 
problems together with the use of canonical correlations of Hotelling would be considered. He 
was not, however, sure how far these problems would be of use in economics. 

In this connection he might mention here that in the Indian Statistical Institute the method of 
principal components had been employed in slightly modified forms in various problems merely 
for reduction of data, i.e. to replace a large number of observations on an individual by a relatively 
few functions of these variables, which preserve the configuration of observations relating to a 
collection of individuals. 

In one case this method had been employed to distinguish the racial traits of a number of 
castes and tribes living in the same geographical area. In another case families were classified 
according to expenditure patterns by taking a single function of the proportionate expenditures 
on various items. An interesting result had been established that expenditure patterns were not 
entirely differentiated by the level of expenditure and family composition. These problems were 
discussed in Sankhyā, the Indian Journal of Statistics (Vol. 7, p. 425 and Vol. 8, p. 201). 

Mr. Herdan said that Mr. Stone had given an admirable example of how to apply factor 
analysis to economic variations. 

There was one technical point he desired to mention in connection with Table VIII, which Mr. 
Stone had said he was not quite satisfied with himself. To compare the frequency distribution 
of the third residual sums of products with a normal distribution — as was done in that table — 
would be quite legitimate in the case of the Spearman or Thurstone analysis, both of which worked 
with specific factors. Anything which was not accounted for by the so-called common factors 
was lumped together in the residual or uniqueness portion of the variance which contained the 
specificity and the unreliability portion of the variable. The Hotelling analysis, however, worked 
with common factors only, all principal factors being here common factors. There was, strictly 
speaking, no room in it for the uniqueness and, consequently, for a random or unreliability portion 
of the variance. 

He would, however, agree with stopping factorization, as Mr. Stone had done, after three 
common factors had been extracted from the variables, since they included by far the greatest 
part of the total variance. The remaining part of the variance was not due to a random element, 
but to the fact that the approximation, which is implied in duplicating a number of variables 
by as many principal factors, had not been carried out to the limit. He agreed that nobody 
would do this in practice. There would be little use in factor analysis if, for the description of 17 
variables, we needed 17 common factors, but from a theoretical point of view it was difficult to 
see what could be done in the case of the Hotelling analysis to satisfy oneself that the analysis had 
been carried sufficiently far. 

This applied in the present case all the more because for the communalities unity was inserted 
in the diagonal of the correlation matrix, which implies that it is intended to factorize the total 
variance and leave no residue whatever. 

He would like to draw attention to the fact that whilst the first and second economic variables 
were positively correlated with the first and second factors respectively, the third economic variable 
was negatively correlated with the third factor, which was difficult to understand. He suggested 
that the Thurstone process of “rotation” might be used for obtaining a “simple structure” with 
only positive loadings. 

It was not often that one met with factor analysis applied to a matrix of correlations containing 
also negative correlation coefficients, and he would like to ask whether, applying the Hotelling 
analysis, the row sums were formed algebraically? 

Mr. Stone intervened to explain that they were formed algebraically, taking the signs into 
account. 



1947] Discussion on Mr. Stone's Paper 41 

Mr. Herdan asked if, for the purpose of obtaining the latent root of the matrix, the division 
was carried out by the absolute highest row sum. 

Mr. Stone agreed that this was so. 
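
The iterative process referred to in this exchange can be sketched in modern terms as follows. This is a minimal illustration only, assuming the usual form of the iteration: a trial vector of unit weights (so that the first products are the algebraic row sums), each set of products being divided by the one that is largest in absolute value. The function name `largest_latent_root` and the small matrix are hypothetical and are not taken from Mr. Stone's paper.

```python
def largest_latent_root(matrix, tol=1e-12, max_iter=10_000):
    """Approximate the largest latent root of a symmetric matrix by iteration."""
    n = len(matrix)
    weights = [1.0] * n  # unit trial vector: the first products are the row sums
    root = 0.0
    for _ in range(max_iter):
        # Products are formed algebraically, i.e. every term keeps its sign.
        products = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
        # Divide by the product that is largest in absolute value.
        divisor = max(products, key=abs)
        if divisor == 0.0:
            raise ValueError("all products vanished; choose another trial vector")
        weights = [p / divisor for p in products]
        if abs(divisor - root) < tol:
            root = divisor
            break
        root = divisor
    return root, weights


# Hypothetical correlation matrix containing negative coefficients.
example = [
    [1.0, 0.6, -0.4],
    [0.6, 1.0, -0.5],
    [-0.4, -0.5, 1.0],
]
root, weights = largest_latent_root(example)
print(root, weights)
```

At convergence the divisor settles to the latent root and the weights to the corresponding vector, scaled so that its largest element is unity. Under these assumptions a variable negatively correlated with the others simply receives a negative weight.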

Professor Allen said that the main economic points had been well put by Mr. Champernowne 
and Dr. Barna, and he would like to say a few words from the point of view of the ordinary 
economist, attempting to understand the subtle work of statisticians like Mr. Stone — the kind of 
economist who had no mathematical flair and little knowledge of statistics, but a great enthusiasm 
for empirical investigation. 

He believed such an economist had appreciated the applicability of the analysis of variance 
to economic data while recognizing that not so much could be expected as under controlled 
experimental conditions. Henry Schultz and many others had taught the economist how to use 
multi-variate linear regression analysis. But the margins of error of the resulting regression 
coefficients were so wide that the depressing conclusion emerged that significant estimates of such 
economic concepts as demand elasticities are unlikely to be obtained from such regression analysis. 
The economist then looked hopefully to the “bunch map” technique of regression analysis as 
developed by Professor Frisch and recently applied by Mr. Stone in a paper read before the 
Society. 

He pointed to another method which was very much used in the U.S. in the ’thirties — for 
example by the economic statisticians of the Department of Agriculture—and which was a simple 
version of Mr. Stone’s present method. In this method, one variable (perhaps a price or an 
output) was selected to be “explained” by other variables. A simple regression was first taken 
on the variable regarded as the most important of the “explaining” factors. Residuals from this 
regression were then related by a second regression on the next most important of the “explaining” 
variables, and so on. 
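
The procedure just described can be sketched as follows. This is only an illustration under stated assumptions: ordinary least-squares simple regressions with an intercept at each stage, and invented data. The function names and the figures are hypothetical and do not come from the Department of Agriculture studies.

```python
def simple_regression(x, y):
    """Least-squares intercept and slope of y on a single variable x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def successive_residual_regressions(y, explaining, order):
    """Regress y on the first variable, then the residuals on the next, and so on."""
    residuals = list(y)
    fits = []
    for name in order:
        x = explaining[name]
        intercept, slope = simple_regression(x, residuals)
        residuals = [r - (intercept + slope * xi) for r, xi in zip(residuals, x)]
        fits.append((name, intercept, slope))
    return fits, residuals


# Invented data: y = 2 + 3*x1 + 0.5*x2, with x1 and x2 uncorrelated.
explaining = {"x1": [1.0, 2.0, 3.0, 4.0], "x2": [1.0, -1.0, -1.0, 1.0]}
y = [5.5, 7.5, 10.5, 14.5]
fits, residuals = successive_residual_regressions(y, explaining, ["x1", "x2"])
print(fits, residuals)
```

With uncorrelated explaining variables, as in this invented case, the successive slopes agree with those of a joint regression. When the explaining variables are correlated the result depends on the order chosen, which is the difficulty taken up in the next paragraph.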

This process, he thought, is open to obvious and severe criticism and, in the end, it reduces 
to a question of whether the user of the method, or anybody else, can know what "explaining" 
variables to take and in what order. Mr. Stone’s present method is a big improvement on this; 
he allows the system itself to say which are the relevant factors, and he takes them all together 
instead of one by one. The difficulty is that the factors which turn up are purely arithmetical 
constructions which still need to be interpreted in economic terms. The step forward taken by 
Mr. Stone is to push forward the point when judgment and interpretation are needed to the very 
end of the analysis, where it should be. 

One minor point, right off the paper, he wished to make concerned the terminology of variance 
and co-variance. He thought that exposition would be much easier if these terms could be used 
for sums, and not for means, of squares and products. But he thought that it was probably too 
late in the day to suggest such a reform. 

Dr. Geary said that he would intervene in the discussion only to make a remark on Mr. 
Champernowne’s observations on Table IX of the paper. When he first read the paper he had been 
much struck by the results given in Table IX, and he could not agree with Mr. Champernowne 
that the correlations shown were to be regarded largely as arithmetical phenomena. Mr. Stone 
had succeeded in expressing many economic variables as linear functions of three factors, which 
factors were found to be highly correlated respectively with national income, with the first difference 
of national income and with time. Dr. Geary agreed that, since the factors were determined by 
maximizing processes, positive correlations found were to be expected, even if one were dealing 
with random series. The point of Mr. Stone’s results was that the correlations found were so 
very high. The factor $F_1$ was a linear function of the original variables, and it was true the 
correlation between $F_1$ and national income would be exactly unity if all the coefficients were unity, since, 
as Mr. Champernowne said, the sum of these variables was twice the national income. Reference 
to the second column of Table IV would show, however, that the coefficients of the $x_j$ in the 
expression for $F_1$ were very different from unity, and this was what rendered so remarkable the 
correlation of ·995 which was found. Nor did the algebraic process from which the second factor 
was derived appear to afford much ground for supposing that so high a correlation as ·948 would 
be found between that factor and the first difference of national income. In his (Dr. Geary’s) 
opinion the results found were probably economically significant. 

Mr. Cohen hoped that the meeting would permit him to raise some difficulties that might be 
answered relatively easily. There did seem to be one point that had not been adequately dealt 
with. It had seemed to him that analysis of this kind raised the question whether the three factors 
were in fact independent and also whether they were really fundamental factors. 

For instance, in what sense could one refer to delta income as an independent factor from 
income and time; it only meant that the dependence on the factors income and time was not 
linear. It was some form of more complicated function to which we might find a closer 
approximation by bringing in the second differential of income. In other words, there was not simply a 
linear relation to income, but a more complicated mathematical function, and it might be possible 
to find a better identification of that mathematical function than the division into these rather 
arbitrary categories. 

He also wondered in what way one could refer to income as an independent factor in respect 
to time, and whether income was not in itself a function of time, and therefore they could possibly 
be reduced to some single, more fundamental factor. 

The real point was, were not these factors simply a conglomeration of other factors? Income 
itself was merely a summation of the original items, and the fluctuations in income resulted from 
fluctuations in the original items with time. He wondered whether there could not be some more 
scientific method of determining the factors upon which the rest depended, and it might be that 
if these factors were not shown in a linear form, but in the form of algebraic or perhaps exponential 
functions, it might become a much easier process; they might reduce to two or one fundamental 
factor. 

There was one further point. It seemed unnecessary to use the surface of a sphere for what 
was after all a two-dimensional diagram. A two-dimensional diagram ought to be able to be 
drawn on a plane surface. 


The following written contribution was received after the meeting. 

Miss N. Carruthers: I should like to draw attention to a note by Dr. C. E. P. Brooks (1927)* 
which outlines a simple exhaustion method of determining an approximate regression equation 
when many variates are involved. It seems to me that this method is very similar to that used in 
factor analysis, although the method of selection of factors (or variates) differs slightly. Both 
methods are very useful for interpolation, but they are rather dangerous as means of deducing the 
causes of fluctuations in the dependent variable(s). As Mr. Stone himself points out, the three or 
more factors (or variates) which account for the greater part of the variance are not necessarily, or 
even generally, the primary causes of fluctuation of the variable. 


Mr. Stone said he would like to reply very briefly, and to say what a great pleasure he had had 
in listening to the comments on his paper, which had given him a great deal to think about. It 
was getting late, and he would like, if he might, to think about the problems which had been 
raised at his leisure and to submit his comments in writing. It only remained for him to thank 
the Society very much for the way in which the paper had been received. 


Mr. Stone subsequently submitted the following comments: 

I agree to a large extent with the technical parts of Mr. Champernowne’s generous opening 
to the discussion of my paper. In particular in some cases the coefficients of the three factors 
[...] significant. The problem of testing significance is, however, especially 
difficult in this field, and it must be remembered that in my example the total variance of 
the observed variates is analysed without any deduction for the error components of the 
individual series. If, but only if, the error component of each of the series is the same 
will the pattern coefficients and the factors obtained by the method which I used be the same as 
those which would be obtained from an analysis allowing for these error components as deductions 
from the leading diagonal of the matrix of sums of squares and products. But while in this case 
the results would be identical, their significance would depend on the importance of the error 
components. Without knowledge of these it is difficult to discuss questions of significance 
in any precise terms. 

It is perhaps worth noticing that at any rate in the above case, in which no allowance is made 
for the error components in the series, there is not in fact a simple complementarity between the 
number of factors needed to describe a set of variables and the number of linear regression 
equations connecting them, despite the fact that the former are associated with the larger and 
the latter with the smaller latent roots of the fundamental determinantal equation. The reason 
seems to be that the larger factors do not pick out the systematic variation, leaving all the error 
to the end, and that some error is in general taken out by each factor, even the largest. I may 
mention in passing that the relationship between factor and regression analysis and the associated 
problems of significance testing are at present being investigated at Cambridge by Dr. Geary. 

* Regression Equations with Many Variates. London: Air Ministry Met. Off. Prof. Notes, No. 47. 

Again, while I find myself in substantial agreement with Mr. Champernowne in the 
explanation he gives of the relationship between the factors and the independently calculated variables, 
I think that he over-estimates the inevitability of the high observed correlation between $I$ and $F_1$ 
due to the high intercorrelations of the primary series. In this connection it is perhaps of interest 
to set out the 136 intercorrelations shown in my Table II in the form of a frequency table. 


Range of correlation.      Frequency. 

    1·0-0·8    . . . .    30 
    0·8-0·6    . . . .    30 
    0·6-0·4    . . . .    29 
    0·4-0·2    . . . .    30 
    0·2-0·0    . . . .    17 


In this table the coefficients have been taken without regard to sign. It can be seen that there 
is a remarkably even spread of coefficients over the whole range, the smallest ones only being 
somewhat deficient. This table is not quite fair to Mr. Champernowne, because it is probably 
true that the variables with the largest variances are more highly intercorrelated than the set of 
variables as a whole. Nevertheless I think there is prima facie evidence of considerable divergence 
between the series, and do not see why a correlation of 0·97 should be expected between $I$ and $F_1$. 
It is, of course, clear that a fairly high correlation in some undefined sense is to be expected. 

My concluding remarks about the use of factor analysis to detect a set of inner variables were 
intended in the following sense: If no more than m factors are needed to describe a set of variates 
in a particular sample of observations, then that sample does not provide enough information to 
test regression equations involving more than m independent variates. It may thus seem 
permissible to try to simplify the original set, since a simpler one might be sufficient to reproduce the 
modes of motion observed in the sample. 

I agree with Mr. Babington Smith that there are unfortunately few degrees of freedom available 
in my example. This is a common difficulty in econometrics arising from the limited period 
for which adequate estimates are available, and the virtual impossibility of increasing the amount 
of information by experimental methods. Nevertheless I believe it is instructive to discover the 
results of analysing the limited amount of material that is available. The close fit of the regression 
equations seems to me to indicate that over this limited period of observation at any rate the 
original variables showed comparatively little independent variation. I do not think this result 
is due to the use of covariances instead of correlations, since if this were the reason it would be 
surprising to find the small variables relatively so well explained. Nor do I think that the 
correlation coefficients in my Table V support the conclusion that the close fit is due to the preponderating 
effect of $F_1$. 

The answer to Mr. Babington Smith's second point is almost certainly no, because a period of 
stability even if it were not quite so stable as to remove all variability would alter people's responses, 
and so change the quantitative relationships between the variables. I do not know how far even 
the relationships themselves would be affected. 

On the third point, the estimates of the constituents are independent in the sense that none 
was derived indirectly by inserting the others in an equation of definition. On the other hand, 
several of the series had common statistical sources, and the continuity of the methods of 
estimation used from year to year would ensure some correlation between the errors of successive 
observations on the same series. In other words, in all this work systematic errors enter as well 
as random errors; indeed, the former may well be the more important. 

Finally, it is undoubtedly true that what Mr. Babington Smith called a Q analysis would 
reveal a factor of neighbourliness between successive years, but I am not clear how far my 
[...] can be regarded as the analogue of this factor in R analysis. May it not be that an explanation 
on the lines given by Mr. Champernowne is really the more satisfactory? In other words, does 
not $\Delta I$ come into the picture because of the time displacement of the original series relative to a 
more or less common mode of variation reflecting the state of business activity or the national 
income? 

It is encouraging to have approval of the technique adopted expressed by so high an authority 
as Sir Cyril Burt, particularly since the precise method is apparently not used much by 
psychologists. The length of the computations involved may well be a reason for not using Hotelling’s 
method, but it can hardly be a reason for preferring correlations to covariances, since the use of 
the latter actually simplifies the calculations. The explanation would seem to lie in the lack of a 
common unit for psychological tests similar to the dollar unit in my example. Of course the 
common unit of money might not be available in economic applications, as would occur for 
example if prices, interest rates, etc., were included among the variables to be analysed. 
Furthermore, if we were considering theories involving the analogue of trigger mechanisms, we might 
lose sight of the most significant features of the system by adopting money as a common unit 
and concentrating on sums of squares and products. 

Mr. Barna seems to me to be too sweeping in his condemnation of multiple regression methods 
as applied to economic time series. There are certainly great difficulties, but I do not think my 
present paper does anything to add to them since it is hardly concerned with sampling problems 
at all. Without going over the ground of my earlier paper on demand analysis, it may be useful 
to indicate the connection between regression analysis and factor analysis by means of an example. 

I shall take as an illustration the $p = 4$ series in my earlier analysis of spirits consumption. 
Suppose these are factorized completely using correlations since there is no common unit and, 
as in the present paper, making no allowance for errors in the series. Now if we make no 
assumptions about the error components we can come to no conclusions about the number of 
significant relationships connecting the variates; the factorization is simply an arithmetical 
process. Since I have no estimates of the variances of the error components of the four 
series, I shall assume for the purposes of this illustration that they are all equal. This assumption 
leaves the ratios of the latent roots, $\lambda$, unaffected and makes it possible, using some of Tintner’s 
results, to discover the minimum common error variance, $V$, compatible with different assumptions 
about the number of relationships, $R$, assumed to connect the four variates. For, as Tintner 
has shown, if $N$ is the number of sets of observations, then $(N - 1)V^{-1}$ times the sum of the 
$(R + 1) = r$ smallest $\lambda$’s must be equal to or greater than the value of $\chi^2$ at some accepted 
critical point with $(N - 1 - p + r)r$ degrees of freedom. 

In the above example $N = 19$, $p = 4$ and $r$ may range from 1 to 4. The following table brings 
together the relevant calculations. The $\chi^2$ column gives the critical (5 per cent.) point of 
$\chi^2$ for the degrees of freedom shown. 

$r = R + 1$    $(N - 1)\sum\lambda$    $(N - 1 - p + r)r$    $\chi^2$    $V^{-1}$    $V$ (max.) 

     1               0·126                   15               25        200         0·005 
     2               1·062                   32               45         42         0·024 
     3               8·892                   51               72          8         0·125 
     4              72·000                   72               97       1·35         0·741 

In fact it is reasonable to expect that $R = 1$, i.e. that there is one and only one relationship, 
the demand relationship connecting the four variables. From the table we see that for $r = 2$ 
the maximum value of $V$ is 0·024, i.e. that if our assumption is true, at most 2½ per cent. of the 
variances of the four variables is ascribable to error. This seems a reasonable figure. The 
other values of $V$ in the table on the other hand do not seem reasonable. With $R = 0$, $V$ is 
absurdly small, i.e. we must have very high expectations of the accuracy of our series if we are 
not satisfied with the fit we get, while with $R > 1$, $V$ is almost impossibly large, i.e. the series can 
hardly be so inaccurate that we could be satisfied with the fit of more than one equation connecting 
them or any subset of them. This result agrees with my original analysis based on bunch maps, 
though there are a great many assumptions involved. 
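
The arithmetic of the table can be retraced with a short sketch. It assumes that the limiting value of $V$ is obtained by dividing $(N - 1)$ times the sum of the $r$ smallest latent roots by the critical value of $\chi^2$, which is how the condition stated above has been read here. The scaled root sums and critical values are copied from the table as printed, not recomputed, and the function names are hypothetical.

```python
N_OBS = 19  # number of sets of observations, N
P = 4       # number of series, p

# Figures transcribed from the table: (N - 1) * sum of the r smallest latent roots,
# and the critical value of chi-squared as printed for each r.
SCALED_ROOT_SUMS = {1: 0.126, 2: 1.062, 3: 8.892, 4: 72.000}
CHI2_POINTS = {1: 25.0, 2: 45.0, 3: 72.0, 4: 97.0}


def degrees_of_freedom(n_obs, p, r):
    """Degrees of freedom (N - 1 - p + r) r attached to the r smallest roots."""
    return (n_obs - 1 - p + r) * r


def limiting_error_variance(scaled_root_sum, chi2_point):
    """Value of V at which (N - 1) V^-1 times the root sum equals the critical point."""
    return scaled_root_sum / chi2_point


rows = []
for r in (1, 2, 3, 4):
    df = degrees_of_freedom(N_OBS, P, r)
    v = limiting_error_variance(SCALED_ROOT_SUMS[r], CHI2_POINTS[r])
    rows.append((r, df, v))
    print(f"r = {r}: d.f. = {df}, 1/V = {1 / v:.2f}, V = {v:.3f}")
```

The degrees of freedom agree exactly with the third column of the table. The values of $V$ agree with the last column only to about the second decimal place, which would be consistent with the printed figures having been taken from rounded reciprocals, though that is a conjecture.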

The point of this example is to show that before we start talking about significance we must 
introduce the idea of error components. I have not done this in my present paper, and do not 
think therefore that my results provide any clue to the interpretation of other analyses in terms 
of statistical significance. 

I agree entirely with Mr. Barna about the use of information other than that derived from 
time series. The case he mentions, budget data, is not as simple as it looks, however, because 
these data are rarely tabulated so that the other determining variables, family size, age of head 
of household, etc., can be inserted in the equation used to explain consumption. In such a case 
it may easily happen that the simple regression between consumption and income is highly 
misleading, as has been shown by Haavelmo. 

The dangers in extending the size of the sample due to institutional changes can to some extent 
be met by a device often employed by Tinbergen of inserting a variable with the value 0 up to 
the date of the change and with the value 1 after that date, which will catch any simple 
discontinuity of the type to which Mr. Barna referred. More complicated cases must in principle 
involve setting up a system of equations and studying the effect on the system when one or more 
of these or their coefficients are changed. 
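
The device of a 0/1 variable can be illustrated by a small sketch. Everything in it is assumed for the purpose of illustration: the years, the figures, the date of the change, and the convention that the year of the change itself takes the value 1. The helper names `step_dummy` and `least_squares` are hypothetical.

```python
def step_dummy(years, change_year):
    """0 before the assumed date of change, 1 from that date onwards."""
    return [0.0 if year < change_year else 1.0 for year in years]


def least_squares(columns, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    aug = [
        [sum(ci * cj for ci, cj in zip(columns[i], columns[j])) for j in range(k)]
        + [sum(ci * yi for ci, yi in zip(columns[i], y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Invented series: y = 1 + 2*x, with a jump of 5 from 1935 onwards.
years = list(range(1930, 1940))
x = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 9.0, 11.0, 10.0, 12.0]
dummy = step_dummy(years, 1935)
y = [1.0 + 2.0 * xi + 5.0 * di for xi, di in zip(x, dummy)]

constant = [1.0] * len(years)
intercept, slope, shift = least_squares([constant, x, dummy], y)
print(intercept, slope, shift)
```

The coefficient of the 0/1 variable estimates the size of the simple discontinuity, here the invented jump of 5, while the slope on `x` is left undisturbed.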

I was interested in Mr. Rao’s contribution to the discussion, but will confine myself to one 
point. In my example it happened that the series which I expected to be capable of describing 
the original data were in fact able to do so. Had they not been, I could have used the factor 
analysis to investigate other possibilities. 

I agree with Mr. Herdan’s main contention, which is, I think, closely connected with the views 
expressed above in commenting on the remarks of Mr. Champernowne and Mr. Barna. I do 
not know how to interpret the residuals from an analysis into Hotelling factors, and tried to 
make this clear in my paper. 

Mr. Cohen’s comment raises a fundamental question, but one which is a little beyond the 
limited scope of my paper. I think it is helpful to express certain variables in terms of others 
without necessarily having a complete theory of the interrelationships of the latter. 





The Principles of Biological Assay 
By D. J. Finney 

Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford 

[Read before the Research Section of the Royal Statistical Society, January 30th, 1947, 

Dr. J. Wishart in the Chair] 

1. Introduction 

“Biological assay, as carried out by the majority of workers in the world, still remains 
a subject for amusement or despair, rather than for satisfaction or self-respect. We have cat 
units, rabbit units, rat units, mouse units, dog units, and, latest addition of all, pigeon units. 
The field of tame laboratory animals having been nearly exhausted, it remains for the bolder 
spirits to discover methods in which a lion or elephant unit may be described.” So wrote 
Professor J. H. Burn (1930), himself one of the leading British exponents of methods of biological 
standardization. He went on to explain that the fault lay not with the concept of biological 
assay, but with those users of it who “are still ignorant of certain principles which during the past 
few years have been shown to be capable of transforming this whole subject from the plane of 
an insidious means of self-deception to that of a well-ordered and progressive science.” The 
pioneer work of Bliss, Fieller, Fisher, Gaddum, Irwin, Tattersfield, Trevan, to name only a few of 
those who have contributed to the development of the theory and practice of biological assay, has 
entirely changed the situation. Biological techniques are now widely accepted, not merely 
as unavoidable evils, but as sound methods for many types of estimation that cannot easily be 
made by direct chemical or physical analysis. Indeed, when both chemical and biological 
methods are available for the same estimation, the generally greater precision of the former 
does not necessarily make it the one to be preferred; if the final result is required in terms of 
the biological potency of a complex material, the greater specificity of a biological assay may give 
it an advantage over a laborious chemical procedure (Bacharach, 1945). 

When I was asked to prepare a paper on biological assay for the Research Section, I had 
to bear in mind that in recent years the Section has already received two papers on this subject. 
Dr. Irwin, in 1937, gave a comprehensive survey of the application of statistical methods to 
various problems of biological assay, and Mr. Fieller, in 1940, gave a paper which, though 
primarily concerned with the assay of insulin, described aspects of assay design and analysis 
useful for many other purposes. Since these papers were written, there have been further 
developments in the use of statistical methods for assays, but mostly in matters of detail rather 
than in fundamentally new ideas; an account of them might interest and help biologists, but to 
statisticians would perhaps seem a statement of the obvious. Instead, I have chosen this 
evening to review the statistico-mathematical principles underlying a good biological assay, and 
to show the essential unity of the different types of statistical analysis commonly used for 
assay data. Though biologists have enunciated the biological conditions requisite 
for a sound assay, so far as I can discover statisticians have never explicitly stated their 
conditions; a paper by Bliss (1940) goes some way towards this, but he does not there discuss 
statistical analysis in detail. The requirements of the biologist and of the statistician are in the 
main merely different aspects of the same truth, but more frequent attention to the statistical 
formulation might prevent some of the loose thinking that often occurs in discussions of assays. 
In order to avoid misunderstanding, I should point out that this paper has been written primarily 
for statisticians, and I have stressed the matters which seem to me of greatest statistical interest; 
had I been writing for an audience of biochemists or pharmacologists, or of those who must 
base commercial policy and actions on the findings of biologists and statisticians, though the 
mathematics would be unaltered I might have placed my emphasis rather differently. 

What is a biological assay? In an excellent review of recent literature, Bliss and Cattell 
(1943) say: “Biological assays may be defined as determinations of potency or toxicity based 
upon the reaction of living matter, including biological reactions not involving intact cells, such 
as serological tests in vitro.” Thus a comparison of the relative potencies of two samples of 
insulin by means of the changes they produce in the blood sugar of rabbits is a biological assay; 
the emphasis must be placed on the estimation of potency, and a similar experiment in which 
the chief interest lay in the effect of insulin on the rabbit would not be an assay. Again, an 
experiment on the increase in potato yield caused by nitrogenous fertilizer would not generally 
be considered an assay; nevertheless, if the data were to be used for estimating the amount of 
a standard sulphate of ammonia equivalent to one unit of another fertilizer, the experiment 
would then come within the terms of the definition just given. The typical assay involves two 
components, a stimulus (e.g. a drug, a poison, a vitamin) and a subject (e.g. an animal, a plant, 
a bacterial culture). The potency of the stimulus is assessed by means of the effect it produces 
on a selected measurement of the subject (e.g. weight of subject or of a particular organ, 
percentage blood sugar, or a quantal “measurement” such as death); this measurement is the 
response, and, before a particular reaction of stimulus and subject can be made the basis of an 
assay, something must be known of the relationship between the dose of stimulus and the 
magnitude of the response. A stimulus can then be assayed either absolutely from a known 
dose-response curve, or, more commonly, by the comparison of doses of different stimuli which 
produce equal responses. Failure to distinguish between the needs of the preliminary 
investigation of the dose-response law and those of an assay to be used as a routine analytical method 
has sometimes confused the discussion of efficiency in assay design. 

In the paper from which I have already quoted, Burn stated five principles of a good assay 
technique; he was writing as a biologist, but his conclusions are closely related to those I intend 
to put before you as the statistician's principles. Firstly he emphasized the much greater 
reliability of potency estimates obtained by comparison of the material to be assayed with 
a standard, as against attempts to measure absolute potency in “animal units,” and the 
need for taking due account of variations between individual test subjects. In this 
he followed Trevan's (1927) demonstration of the misleading deductions that may be drawn 
from tests on small numbers of subjects unless sound statistical techniques are used. He then 
specified conditions which must be satisfied by the dose-response relationship if a valid assay is 
to be based upon it; these are biological versions of what, in the next section, I call Conditions I 
and II. I propose this evening to state these principles in statistical terms, then to show general 
methods for the estimation of relative potency and to summarize those at present in use for 
quantitative and quantal responses, and finally to discuss the planning of assays for obtaining 
potency estimates of maximum precision. 


2. The Relationship of Dose and Response 

The stimulus in any assay will be a dose of either a standard or a “test” preparation containing 
the substance to be assayed — for example, a dose containing either a known amount of nicotinic 
acid or a known amount of a foodstuff whose (unknown) content of the vitamin is to be assayed. 
A biologically valid assay must be such “that the response supposed to be produced by the 
known amounts of ‘factor X’ is actually due to the factor itself and not to some other substance 
associated with it, e.g. an impurity; and that the response produced by the material 
to be analysed (assayed) is also due solely to the presence in it of ‘factor X,’ without augmentation, 
diminution, or modification by any other substance also present. In other words, if we use the 
terms ‘Standard Preparation’ and ‘Test Preparation’ to denote respectively the solution of 
allegedly pure ‘factor X’ and the solution prepared from the material to be analysed, we assume 
that the Std. Prep. contains no substance, other than factor X itself, contributing to the response 
we measure, and that the Test Prep. behaves for the purposes of the analysis so similarly to the 
Std. Prep. that it may be regarded simply as a dilution of the Std. Prep. in a completely inert 
diluent” (Wood, 1946^). No statistical process can distinguish between Wood’s “factor 
X” and a chemically different substance which affects the measured response in the 
same manner— a self-evident truth which nevertheless is sometimes forgotten. On the other 
hand, the statistician must examine the second part of Wood’s statement, his “hypothesis 
of similarity,” which is equivalent to Condition II below. 



48 Finney — Principles of Biological Assay [No, 1, 

Suppose that a test subject is given a dose $z$ of a stimulus and that, when the dose has had 
time to produce its effect, a response $u$ is measured on the subject. For a given 
population of subjects, tested under constant experimental conditions, the expected response 
to any dose may be written 

$$E(u) = U, \tag{1}$$

where $U$ is a regression function of the form 

$$U = F(z). \tag{2}$$

If this relationship is to be made the basis of an assay, $U$ must be a single-valued strictly monotonic 
function of $z$, at least over the range of doses to be used (Condition I); the function need 
not be continuous, though in practice it will usually be so. Without loss of generality, $U$ may 
be taken as an increasing function of $z$.* 

If a standard and a test preparation both satisfy this condition, and have regression 
functions 

$$U_s = F_s(z), \tag{3}$$

$$U_t = F_t(z), \tag{4}$$

then, for any selected value of $U$ within the range of validity, doses $z_s$, $z_t$ can be uniquely determined 
such that the expected values of $u$ for these amounts of the two preparations are both 
$U$. Hence, at this level of response, the second material may be said to have $z_s/z_t$ times the 
potency of the first. 

If the two preparations each contain the same effective constituent (or the same effective 
constituents, in constant proportions) and all other constituents are without effect on $U$, so 
that the less potent preparation behaves as though it were a dilution of the other in a completely 
inert diluent, this ratio must be independent of $U$, for it represents the relative amounts of the 
effective constituent in equal doses (Condition II). Equation (4) can then be written 

$$U_t = F_s(\rho z),$$

where $\rho$, a constant, is the potency of the second preparation relative to the first. 

The fundamental assay problem may thus be stated as follows: A standard preparation, 
whose content of the effective constituent either is known or is defined in terms of arbitrary 
units, has a dose-response regression function, 

$$U_s = F(z), \tag{5}$$

which is subject to Condition I above. A test preparation is known to have the same effective 
constituent, and the amount present in unit dose is to be estimated, making use of the fact that 
for this preparation 

$$U_t = F(\rho z). \tag{6}$$

The required content of the effective constituent per unit dose is $\rho$ times the content per unit 
of the standard preparation, where $\rho$ is a parameter which has to be estimated from the measured 
responses of a sample of test subjects. 
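
The statement of the problem can be checked numerically. What follows is only a minimal sketch under stated assumptions, not a procedure from the paper: the curve `F`, the value `RHO = 0.25` and the helper `dose_for_response` are hypothetical choices made for illustration. It finds by bisection the doses of the two preparations that give the same expected response, and shows that when the test preparation behaves as a dilution of the standard the ratio of those doses is the same at every response level.

```python
import math

def F(z):
    """Hypothetical increasing dose-response curve for the standard (Condition I)."""
    return 10.0 * z / (1.0 + z)

RHO = 0.25  # assumed relative potency of the test preparation (illustrative only)

def F_test(z):
    """Test preparation acting as a dilution of the standard: U_t = F(rho * z)."""
    return F(RHO * z)

def dose_for_response(curve, target, lo=1e-6, hi=1e6):
    """Bisection on the log-dose scale for the dose whose expected response is `target`."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def equipotent_ratio(target):
    """Ratio z_s / z_t of the doses that give the same expected response."""
    return dose_for_response(F, target) / dose_for_response(F_test, target)

# The ratio is the same at low, middle and high response levels.
ratios = [equipotent_ratio(U) for U in (2.0, 5.0, 8.0)]
print(ratios)
```

If the second curve were not of the form F(rho * z), the printed ratios would differ from one response level to another, which is the failure of Condition II.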

If the data cannot adequately be described by the same form of F( ) for both preparations, the 
basic assumption that only the same effective constituents were concerned in both must be false, 
unless the conditions of testing of the two preparations were not maintained sufficiently nearly 
constant. The assay is therefore invalid, either because of the inherent incommensurability of the 
substances or because of the non-comparability of the tests. Except for sampling variation, all 

* It is conceivable that an assay technique might be based on a regression function not satisfying 
Condition I; even a function showing maxima and minima might be fitted to the data for the two preparations, 
and, providing that Condition II were satisfied, a relative potency could be determined as the 
magnification of the one curve relative to the other, though there would not necessarily be a unique dose 
of the one preparation of potency equivalent to a specified dose of the other. This situation seems scarcely 
likely to be of practical importance. 





valid assay procedures for the same test preparation should give the same estimate of potency, 
irrespective of the test subject, the measurement chosen as the response, and the conditions of 
testing. 

On the other hand, the fact that tests of two preparations give responses whose measurements 
can be satisfactorily fitted by equations such as (5) and (6) is in no way a demonstration that they 
have a common effective constituent. Various toxic compounds found in derris root (rotenone, 
elliptone, toxicarol, etc.) have shown constant relative potencies in insecticidal sprays used against 
Aphis rumicis (Martin, 1940, 1942), yet they are of different (though related) chemical structure. 
Even preparations of widely different chemical composition may, under particular conditions of 
test, act in conformity with equations (5) and (6) ; their relative potencies might still be useful as an 
expression of their behaviour under these conditions, but would not necessarily have any wider 
applicability. Wood (1944b) has drawn attention to the difference between these two types of 
assay, with special reference to the choice of experimental design. 

In most assays, though the form of the function F( ) may be known, its specification will 
contain one or more parameters which have also to be estimated from the data. The experimenter 
would often like to determine a dose-response curve for his standard preparation and assay 
subsequent test preparations by reference to this curve. This procedure appears to have the 
considerable advantage that the standard curve could be determined with high precision, and in 
effect only the results for the test preparation would have an appreciable error. Unfortunately, 
the standard curve for many stimuli and test subjects does not remain constant in position, or 
even in slope and other characteristics of shape, but varies from one assay to another, probably 
because of uncontrollable differences in the experimental conditions and in the laboratory stock 
of test subjects. Bliss and Packard (1941) have described one example of a dose-response relationship 
which remained constant over a long period, namely the mortality of eggs of Drosophila 
melanogaster exposed to Roentgen rays ; biological assays of this type of radiation might be based 
satisfactorily on a standard curve. But such stability of the standard curve is exceptional, and 
generally an assay procedure must involve simultaneous or consecutive tests of the standard 
and test preparations, in order that the results may be truly comparable. The optimal distribution 
of labour between the two will be considered in Section 7. 

The theoretical basis of the statistical analysis when the parameters of F ( ) are known need 
not be discussed separately from the more usual situation of unknown parameters, of which it is 
mathematically a degenerate case. The most important class of dose-response relationships 
comprises those which may be written in a modified form of (5): 

$$U_s = f(\alpha + \beta z^{\lambda}), \tag{7}$$

where $f(\ )$ is again a function satisfying Condition I, and one whose parameters are all known, 
$\lambda$ is a known parameter, and $\alpha$, $\beta$ require to be estimated from experimental data. As a limiting 
form of (7), corresponding to $\lambda = 0$, 

$$U_s = f(\alpha + \beta \log z) \tag{8}$$

may be included. A new scale of measurement of dose may be defined by transformation of the 
original $z$ to 

$$x = z^{\lambda} \tag{9}$$

or, for equation (8), 

$$x = \log z, \tag{10}$$

so that 

$$U_s = f(\alpha + \beta x); \tag{11}$$

Finney (1947) has called such a transformed measure of dose the dosage, but perhaps the term 
dose metameter (Bacharach et al., 1942) is less likely to prove confusing. If now a response 
metameter is defined by 

$$Y = f^{-1}(U), \tag{12}$$

equation (11) may be re-written 

$$Y_s = \alpha + \beta x. \tag{13}$$






When equation (6) is written in terms of the same metameters, it becomes 

$$Y_t = \alpha + \beta \rho^{\lambda} x \qquad (\lambda \neq 0), \tag{14.1}$$

or 

$$Y_t = \alpha + \beta \log \rho + \beta x \qquad (\lambda = 0). \tag{14.2}$$

In this way the two dose-response relationships are represented as straight lines. If $\lambda \neq 0$, the 
two lines intersect at $x = 0$, and the relative potency is the $\lambda$-th root of the ratio of the slopes of 
the two lines. If $\lambda = 0$, the two lines are parallel, and the logarithm of the relative potency is the 
distance between them measured parallel to the $x$-axis. Whatever the value of $\lambda$, the most convenient 
method of estimating $\rho$ is first to determine the two best-fitting straight lines in the $(x, y)$ 
plane, subject to the constraint either of equality at $x = 0$, or of parallelism, and then to calculate 
$R$, the estimate of $\rho$, as the appropriate function of the ratio of slopes or of the distance between 
the lines. The acceptance of a standard dose-response curve determined with high precision 
from previous data, as in the example quoted from Bliss and Packard, imposes the additional 
restriction on equation (14) that $\alpha$, $\beta$ are already known and the only parameter left to be estimated 
is $\rho$; intermediate cases in which one of $\alpha$ and $\beta$ is already known may be recognized, but are 
not of great practical importance. 
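
The two ways of reading a potency off a pair of fitted lines can be sketched as below. This is an illustrative sketch only: the function names, the argument names (`a_s`, `a_t` for intercepts, `b_s`, `b_t`, `b` for slopes, `lam` for the power of the dose) and the use of logarithms to base 10 are assumptions made here, not notation fixed by the paper.

```python
import math

def potency_from_slope_ratio(b_s, b_t, lam):
    """Dose metameter x = z**lam (lam != 0): lines meet at x = 0 and
    the ratio of slopes b_t / b_s estimates R**lam."""
    if lam == 0:
        raise ValueError("for lam = 0 use potency_from_parallel_lines")
    return (b_t / b_s) ** (1.0 / lam)

def potency_from_parallel_lines(a_s, a_t, b, base=10.0):
    """Dose metameter x = log z: lines are parallel and the horizontal
    distance (a_t - a_s) / b between them estimates log R."""
    return base ** ((a_t - a_s) / b)

# Lines built from a known potency of 2 are read back correctly.
rho, beta = 2.0, 3.0
r_slope = potency_from_slope_ratio(beta, beta * rho ** 0.5, lam=0.5)
r_parallel = potency_from_parallel_lines(1.0, 1.0 + beta * math.log10(rho), beta)
print(r_slope, r_parallel)
```

Both functions take lines that have already been fitted subject to the appropriate constraint (common intercept or common slope); they do not themselves perform the fitting.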

A more general problem would, of course, be that which required $\lambda$ also to be estimated from 
the data of each assay. The maximum likelihood equations then become much more complicated 
than those developed in Section 3; an example of the computations has been given by Finney 
(1947). Fortunately in many types of assay there is considerable evidence either that $\lambda = 1$ or 
that $\lambda = 0$ over a wide range of doses, and that departures from the usual value are indications 
of an unreliable assay rather than of true variations in $\lambda$. Preliminary investigations will often 
establish some simple value of $\lambda$ as the normal for a particular assay technique, after which this 
value may be adopted for future routine assays unless the accumulation of further data indicates 
this assumption to be unjustifiable. 

3. Estimation of Relative Potency 

When the dose-response relationship is of the type represented by equations (7) or (8), the 
relative potency of two preparations may be estimated from measurement of the response of test 
subjects at several dose levels of each preparation. If no assessment of the precision of the 
estimate is required, a graphical estimation procedure may be sufficiently exact. For the preliminary 
examination of results, and for some routine purposes when the consequences of occasional 
mis-statements of potency are not very serious, possibly nothing more than rapid graphical 
analysis will be needed; this is also often required as a first step towards a more complete analysis. 
In the graphical method each observed response, $u$, is transformed to a response metameter, $y$, 
according to equation (12); $y$ is then plotted against the dose metameter, $x$. If $x$ is a power of 
the dose, as in equation (9), the series of points for the two preparations should lie on two straight 
lines constrained to have the same value at $x = 0$; these lines may be drawn by eye and an estimate, 
$R$, of the relative potency obtained by equating the ratio of their slopes to $R^{\lambda}$. If $x$ is the logarithm 
of the dose, as in equation (10), the two series of points should lie on two parallel lines; again the 
lines may be drawn by eye, and $R$ is then given by equating the horizontal distance between the 
lines to $\log R$. 

With experience, drawing these lines by eye can be made to give estimates of potency very little 
different from those obtained by computation, especially if some regard is paid to the major 
differences in the weights to be attached to the points (see below). For most types of assay, 
however, the precision of the potency estimate has to be assessed before important decisions 
relating to the material assayed can be taken. Though calculations of precision could be based 
upon measurements of individual deviations from lines fitted by eye, such a course has little, if any, 
advantage over a complete arithmetical analysis of the kind now to be described. 

This section is written primarily in relation to responses which are quantitative measurements; 
the similar methods for quantal responses will form the subject of Section 5. Before a full analysis 
can be begun, some assumption must be made about the variation in response shown by test 
subjects receiving the same dose. Individual responses, $u$, may often be assumed normally 
distributed about their expected value, U; as is well known, small departures from normality 





seldom have any serious effect, especially when (as here) the most important conclusions will be 
based upon means, the distributions of which will approach more closely to the normal form 
than those of the original observations. Furthermore, $u$ may be assumed to be measured on a 
scale on which its variance is, practically speaking, independent of $U$; some simple preliminary 
transformation may be necessary to ensure this, and will not seriously interfere with the normality 
condition.* The variance per subject can then be assessed from the sum of squares of deviations 
of $u$ “within dose-groups.” 

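The method of maximum likelihood (Fisher, 1922, 1925), here equivalent to the method of 
least squares, may be applied to the estimation of the unknown parameters, $\alpha$ and $\beta$, in the dose-response 
relationship for a single preparation. The estimates are determined from the data so as 
to minimize the quantity 

$$L = \tfrac{1}{2} S(u - U)^2 \tag{15}$$

with respect to variations in $\alpha$ and $\beta$. Suppose that $a_1$, $b_1$ have been obtained, from a diagram or 
in any other way, as first approximations to the estimates. Then by the Taylor-Maclaurin expansion, 
to the first order of small quantities, improved values will be $a_2 = a_1 + \delta a_1$, $b_2 = b_1 + \delta b_1$, 
where the increments $\delta a_1$, $\delta b_1$ are the solutions of 

$$\left.\begin{aligned}
\delta a_1 \frac{\partial^2 L}{\partial \alpha_1^2} + \delta b_1 \frac{\partial^2 L}{\partial \alpha_1 \partial \beta_1} &= -\frac{\partial L}{\partial \alpha_1}, \\
\delta a_1 \frac{\partial^2 L}{\partial \alpha_1 \partial \beta_1} + \delta b_1 \frac{\partial^2 L}{\partial \beta_1^2} &= -\frac{\partial L}{\partial \beta_1}.
\end{aligned}\right\} \tag{16}$$

In these equations, the addition of the suffix to $\alpha$, $\beta$ is intended to indicate that after differentiation 
the first approximations are substituted. The three second-order differential coefficients may be 
replaced by their expected values, so that the equations become 

$$\left.\begin{aligned}
\delta a_1 S\{f'(Y_1)\}^2 + \delta b_1 Sx\{f'(Y_1)\}^2 &= S(u - U_1)f'(Y_1), \\
\delta a_1 Sx\{f'(Y_1)\}^2 + \delta b_1 Sx^2\{f'(Y_1)\}^2 &= Sx(u - U_1)f'(Y_1),
\end{aligned}\right\} \tag{17}$$

where $U_1 = f(a_1 + b_1 x)$ and $Y_1 = a_1 + b_1 x$. By the use of equation (12), and introducing the weighting coefficient 

$$w = \{f'(Y)\}^2, \tag{18}$$

these equations may be rewritten 

$$\left.\begin{aligned}
\delta a_1 Sw + \delta b_1 Swx &= Sw(u - U_1)/f'(Y_1), \\
\delta a_1 Swx + \delta b_1 Swx^2 &= Swx(u - U_1)/f'(Y_1).
\end{aligned}\right\} \tag{19}$$

If now a working response, $y$, is defined by 

$$y = Y + (u - U)/f'(Y), \tag{20}$$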

* Mr. E. C. Fieller has kindly shown me the typescript of an interesting paper (Fieller, 1947) in which 
he emphasizes the importance of a transformation which will equalize variances. Unfortunately I have 
seen his paper too late to be able to take full account of it in my own. Use of this transformation does 
not seem to be inconsistent with the method of analysis given here, for it amounts only to a definition of the 
scale in which the responses are considered to be measured. The further transformation to a response 
metameter, equation (12), is then adopted as an aid to computation, but, since compensatory changes will 
be made in the weights attached to observations, the estimates eventually obtained will be unaltered ; the 
second transformation, in fact, is no more than a device which helps forward the systematic solution of 
the maximum likelihood equations, and which happens also to give a convenient linear representation of 
the data. 




equations (19) show that the weighted linear regression of the working response, calculated 
from the first approximation, on $x$ is 

$$Y_2 = a_2 + b_2 x, \tag{21}$$

a new approximation to the maximum likelihood estimate of the dose-response relationship; the 
weight of each observation in the regression calculations is the appropriate $w$. 

The procedure for the estimation of $\alpha$ and $\beta$ from experimental data may thus be put into the 
form of a successive-approximation method; the outline of this method was first given, for a 
special case, by Fisher (1935), and the theory was developed more fully by Garwood (1940). When 
the same response transformation has to be used often, the computations can be expedited by 
preparing a special set of tables. Firstly a table of $Y$ as a function of $U$ and secondly tables of 
the weighting coefficient $w$, the minimum working response $Y - U/f'(Y)$, and the range, $1/f'(Y)$, 
in terms of the argument $Y$, are required; the names are given as analogues to the terminology 
of probit analysis, though the one is not a true minimum if negative responses can occur and the 
other is not a range in any obvious sense. The first step in the calculation is to read from the first 
table the values of $y$ corresponding to each measured $u$, and to plot these against the dose metameter 
$x$. On this diagram a provisional estimate of the line (13) is then drawn by eye and the 
“expected” values of $Y$ corresponding to each dose determined from the line. To the minimum 
working response corresponding with each $Y$ is added the product of the range with the observed 
$u$; the resulting working response, $y$, is assigned the weight, $w$, appropriate to the value of $Y$, 
and the weighted linear regression of $y$ on $x$ is calculated in the ordinary manner. If this line 
differs appreciably from the provisional line, it may be used as a new “provisional” and the cycle 
of computations repeated. 
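
The cycle just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's own computing scheme: the function name `fit_metameter_line` is hypothetical, the response curve U = f(Y) and its derivative are supplied by the caller, the weight is taken as the square of f'(Y) and the working response as Y + (u - U)/f'(Y) (which is what least squares gives for a curve of this kind), and the exponential curve in the usage lines is merely a convenient example.

```python
import numpy as np

def fit_metameter_line(x, u, f, f_prime, a, b, cycles=50, tol=1e-12):
    """Successive approximation for Y = a + b*x when the expected response is U = f(Y).

    a, b are the provisional values (e.g. read from a line drawn by eye).
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u, dtype=float)
    for _ in range(cycles):
        Y = a + b * x                      # expected response metameter
        slope = f_prime(Y)
        w = slope ** 2                     # weighting coefficient (assumed form)
        y = Y + (u - f(Y)) / slope         # working response
        xbar = np.sum(w * x) / np.sum(w)
        ybar = np.sum(w * y) / np.sum(w)
        b_new = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
        a_new = ybar - b_new * xbar        # weighted regression of y on x
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b

# Usage: error-free responses from U = exp(0.5 + 1.2 x), rough starting line.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
u = np.exp(0.5 + 1.2 * x)
a_hat, b_hat = fit_metameter_line(x, u, np.exp, np.exp, a=0.3, b=1.0)
print(a_hat, b_hat)
```

With f(Y) = Y the weight is unity and the working response equals the observed response, so the loop settles after a single ordinary regression.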

The procedure requires very little adaptation for use in an assay. The empirical results for 
tests at various dose levels of the standard and test preparations are plotted in the $(x, y)$ plane. 
Two provisional lines are then drawn by eye, subject to the condition that they shall intersect 
at $x = 0$ if the dose metameter is $x = z^{\lambda}$, or that they shall be parallel if the dose metameter is 
$x = \log z$. Working responses and weights are found just as for the single line, and improved 
approximations to the parameters obtained from the two weighted linear regressions of $y$ on $x$; 
the regressions are now calculated subject to the condition either of intersection at $x = 0$ or of 
parallelism. As in the purely graphical process, the estimate of relative potency is eventually 
formed either from the ratio of slopes of two lines or from the horizontal distance between two 
parallel lines. 

In the sense in which the term has so far been used, a biological assay is a method of estimating 
the composition of the test preparation; apart from sampling errors, the same result should be 
attained whatever experimental technique is used, independently of choice of test subject and 
measurement, provided only that the measurement made satisfies Conditions I and II. Bliss and 
Cattell’s definition also includes assays of relative potency of, for example, two different methods 
of application of a poison in which the result need not be independent of the test subject or of the 
technique chosen. Thus Tattersfield and Potter (1943) found the relative potencies of two different 
ways of applying a rotenone spray to the chrysanthemum aphis to be different when a fixed spray 
deposit was applied at different rotenone concentrations from what it was when a fixed concen- 
tration was applied at different amounts of deposit. Even in these circumstances, experimental 
results may usefully be summarized in terms of relative potencies, and the statistical procedure 
will be the same as that needed for the analytical type of assay. The numerical example in Section 4 
is an analytical assay, that in Section 5 is an assay only in the wider sense. 


4. Quantitative Responses 

When the response is a quantitative measurement on individual subjects, experience has shown 
that for a wide variety of circumstances the response itself is linearly related either to the logarithm 
of the dose or to a power of the dose (usually the first power). If the response metameter is identical 
with the response, 

$$f(Y) = Y \quad \text{and} \quad f'(Y) = 1,$$





so that the weighting coefficient is unity and the working response is the same as the empirical 
response. There is then no need for any successive approximation process, and each observation 
is given unit weight in the calculation of regressions of y on x for each preparation. 

Assays in which the response is linearly related to the logarithm of the dose are the commonest 
type in use to-day. Such a relationship can hold only over a limited range of responses, and must 
exclude zero dose. Irwin (1937) and Fieller (1940) have already shown this Society examples of 
their use, and they are now widely applied to the assay of vitamins, drugs, hormones, and other 
materials. By ordinary regression technique, two parallel straight lines are fitted to the data: 

$$\left.\begin{aligned}
Y_s &= a_s + b x_s, \\
Y_t &= a_t + b x_t.
\end{aligned}\right\} \tag{22}$$

Comparison with the “expected equations” (13), (14.2) then gives $R$ as the estimate of $\rho$, where 

$$\log R = M = (a_t - a_s)/b. \tag{23}$$


Even when the main analysis is to be arithmetical rather than graphical, a rough diagram of the 
dose-response relationship for both preparations is often valuable, as it may reveal irregularities 
or trends which would escape notice in an uncritical application of routine arithmetic. 

The calculations are often most conveniently carried out with the aid of an analysis of variance ; 
in order that the assay may be valid there must be no indication of departure from linearity for 
either preparation (as evidenced by comparing the residual variation of the mean responses for 
each dose with the variation amongst individual responses for a fixed dose) and no indication of 
a departure from parallelism. If $s^2$ is the “within doses” variance obtained in this analysis, the 
variance of $M$ may for some purposes be taken as 

$$s_M^2 = V(M) = \frac{s^2}{b^2}\left\{\frac{1}{N_s} + \frac{1}{N_t} + \frac{(M - \bar{x}_s + \bar{x}_t)^2}{S_{xx}}\right\}, \tag{24}$$

where $N_s$, $N_t$ are the total numbers of observations for the standard and test preparations, $\bar{x}_s$ and 
$\bar{x}_t$ the weighted means of $x$, and $S_{xx}$ is the sum of squares of deviations for $x$ used in the calculation 
of the common regression coefficient, $b$. Strictly speaking, the variance of a ratio of two normally 
distributed quantities is infinite, and equation (24) is an approximation of limited usefulness; 
it can be used for assigning fiducial limits to $M$ only if $b$ is large compared with its standard error 
at the level of probability chosen. The variance of $b$ is 

$$V(b) = s^2/S_{xx}; \tag{25}$$

if $t$ is the deviate for the chosen probability level, with the number of degrees of freedom on which 
$s^2$ is based, and 

$$g = s^2 t^2/(b^2 S_{xx}), \tag{26}$$

the exact fiducial limits may be found as 

$$M + \frac{g}{1-g}(M - \bar{x}_s + \bar{x}_t) \pm \frac{ts}{b(1-g)}\left\{(1-g)\left(\frac{1}{N_s} + \frac{1}{N_t}\right) + \frac{(M - \bar{x}_s + \bar{x}_t)^2}{S_{xx}}\right\}^{1/2}, \tag{27}$$

which tend to $M \pm s_M t$ when $g$ is small. This result is analogous to one developed by Bliss (1935b) 
for the fiducial limits of the median lethal dose in the analysis of quantal response data; the 
extension and general application to the assessment of precision in all types of assay seems first 
to have been appreciated by Fieller (1940). Conclusions based on equation (24) may be very 
misleading unless the condition that g shall be small is satisfied, and users of the assay technique 
should always assess their limits of error by means of (27) except when they are certain that the 
simpler calculations will lead to essentially the same results. No examples of the familiar pattern 
of computations for this type of quantitative response assay need be given here. 
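
Although the paper gives no worked example for this case, the arithmetic is short enough to sketch. The code below is an illustration only: the function name `parallel_line_assay`, the small data set and the values of `s2` and `t` passed to it are hypothetical, each observation is given unit weight, and the limits are computed from the usual Fieller form for a ratio, which is assumed here to be what the exact limits amount to. The "within doses" variance and the t deviate are supplied by the caller, as they would be from an analysis of variance and a table of t.

```python
import math

def parallel_line_assay(x_s, y_s, x_t, y_t, s2, t):
    """Log potency M, its approximate variance, g, and exact fiducial limits
    for two parallel lines fitted with unit weights (x is log dose)."""
    n_s, n_t = len(x_s), len(x_t)
    xbar_s, ybar_s = sum(x_s) / n_s, sum(y_s) / n_s
    xbar_t, ybar_t = sum(x_t) / n_t, sum(y_t) / n_t
    # Pooled sums of squares and products about each preparation's own means.
    s_xx = sum((x - xbar_s) ** 2 for x in x_s) + sum((x - xbar_t) ** 2 for x in x_t)
    s_xy = (sum((x - xbar_s) * (y - ybar_s) for x, y in zip(x_s, y_s))
            + sum((x - xbar_t) * (y - ybar_t) for x, y in zip(x_t, y_t)))
    b = s_xy / s_xx                         # common slope
    d = (ybar_t - ybar_s) / b               # equals M - xbar_s + xbar_t
    m = xbar_s - xbar_t + d                 # M = (a_t - a_s) / b
    var_m = s2 / b ** 2 * (1 / n_s + 1 / n_t + d ** 2 / s_xx)
    g = s2 * t ** 2 / (b ** 2 * s_xx)
    if g >= 1:
        raise ValueError("slope not significant: limits are not finite")
    half = (t * math.sqrt(s2) / (abs(b) * (1 - g))
            * math.sqrt((1 - g) * (1 / n_s + 1 / n_t) + d ** 2 / s_xx))
    centre = m + g / (1 - g) * d
    return m, var_m, g, (centre - half, centre + half)

# Hypothetical data lying exactly on parallel lines with log potency 0.3.
x_s, y_s = [0.0, 1.0, 2.0, 3.0], [2.0, 5.0, 8.0, 11.0]
x_t, y_t = [0.5, 1.5, 2.5], [4.4, 7.4, 10.4]
m, var_m, g, limits = parallel_line_assay(x_s, y_s, x_t, y_t, s2=1e-6, t=2.0)
print(m, g, limits)
```

When g is negligible, as in this artificial case, the limits agree with M plus or minus t times the square root of the approximate variance; when g is appreciable the two calculations part company, which is the point of the warning in the text.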

In recent years microbiological techniques have been developed for the assay of vitamins and 
other substances. The test subject is then no longer a single animal, but is a tube containing a 
basal medium plus a dose of the standard or test preparation. The tube is sterilized and, after 





the addition of a fixed amount of a standard bacterial culture, is incubated under standard con- 
ditions of time and temperature. The bacterial growth in the tube is then assessed by a measure- 
ment of acidity, turbidity, or other quantity closely correlated with growth; this measurement 
constitutes the response. 

Experience has shown that, with a carefully controlled biological technique and a suitable 
basal medium, some types of micro-biological assay usually show a linear relationship between 
the response and dose from zero dose up to a moderately large quantity, though the addition to 
the basal medium of a very small amount of the factor under assay may be necessary to ensure 
linearity (Wood, 1946a). For other assay techniques more complex relationships occur. These 
have not yet been studied at all completely; some appear to fall within the class defined by equation 
(11), and linear relationships between the logarithm of the response and the logarithm of the dose 
have recently been reported (Wood, 1946b). Once the form of the response and dose transformations 
have been settled, the general theory of Section 3 may be applied. For practical purposes, 
in this and other types of assay, the correctly weighted full analysis may not always be needed. 
If the data agree well with a linear relationship between dose and response metameters, the fitting 
of unweighted linear regressions will scarcely affect the estimate of potency; estimates of variance, 
fiducial limits, and tests of assay validity, however, may be more seriously affected. The merits 
and demerits of this simplification in the arithmetic will need to be examined more closely 
when a greater amount of evidence on the form of transformations to response metameters has 
accumulated. 

Only the simple form of analysis, in which no transformation of dose or response is made, 
will be discussed in detail here. The responses obtained with a series of doses of the two prepara- 
tions have then to be fitted by the equations : 


$$\left.\begin{aligned}
Y_s &= a + b_s x_s, \\
Y_t &= a + b_t x_t.
\end{aligned}\right\} \tag{28}$$

“Blank” or zero-dose tubes usually will be included in the test, and these give a direct estimate 
of $\alpha$. Comparison with equations (13), (14.1) shows the estimate of $\rho$ to be 

$$R = b_t/b_s. \tag{29}$$

The statistical technique for dealing with data from this type of assay has been described by 
Finney (1945); more detailed formulae have been given by Bliss (1946) and by Wood and Finney 
(1946). The appropriate procedure may be considered as the fitting of a bivariate regression 
equation 

$$Y = a + b_s x_s + b_t x_t \tag{30}$$

to the whole of the data, every observation having either $x_s = 0$ or $x_t = 0$. Comparison of the 
residual variation “between doses,” after elimination of the regression component, with the 
variance “within doses,” again in an analysis of variance, gives a comprehensive significance test 
of the deviations from linearity and hence of the validity of the assay. The computations are 
illustrated in the following example. 

Example: Kent-Jones and Meiklejohn (1944) have given the detailed results of an assay of 
nicotinic acid in a meat extract. Five concentrations of a standard nicotinic acid preparation and 
three of a solution prepared from the meat extract were inoculated (in duplicate tubes) with a 
standard culture of Lactobacillus arabinosus; two blank tubes were also inoculated. After incubation 
at 37° C. for 72 hours, each tube was titrated with N/14 sodium hydroxide to give a measure 
of acidity. The results of the titrations are shown in Table 1, and a diagram (Fig. 1) shows the 
two series of points to lie nearly on two straight lines. 

The normal equations for $b_s$, $b_t$ are found in the usual manner as 

$$0.15\,b_s - 0.75\,b_t = 2.34,$$

$$-0.75\,b_s + 10.00\,b_t = 7.65;$$





Table 1 

Assay of Nicotinic Acid in Meat Extract 

(Data of Kent-Jones and Meiklejohn) 

    µg. nicotinic acid    ml. NaOH      ml. solution        ml. NaOH
    per tube (x_s)        (y)           per tube (x_t)      (y)

    0.05                  3.5, 3.2      1.0                 4.9, 4.8
    0.10                  5.0, 4.7
    0.15                  6.2, 6.1      1.5                 6.3, 6.5
    0.20                  8.0, 7.7
    0.25                  9.4, 9.5      2.0                 7.7, 7.7

    Blank tubes (x_s = x_t = 0): 1.5, 1.4 ml. NaOH.

Fig. 1. Response diagram for assay of nicotinic acid in a meat extract. 


the numerical values here are exact. The inverse matrix of the coefficients, to be used also in 
obtaining the variance of the estimated potency, is 

$$V = \begin{pmatrix} 10.\dot{6} & 0.8 \\ 0.8 & 0.16 \end{pmatrix},$$

whence 

$$b_s = 31.08, \qquad b_t = 3.096,$$

and therefore 

$$R = 0.0996.$$





This value of $R$ is expressed as µg. nicotinic acid per ml. of test preparation, and, as 1 gram 
of the original meat extract was contained in 5000 ml. of the solution, the potency of the extract 
is estimated to be 498 µg. per g. 
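
The arithmetic of this example can be retraced with a short script. It is offered as a sketch, not as the computing scheme used in the paper: the variable names are hypothetical, the data are the titrations of Table 1 as read from the printed table, and the common-intercept fit is carried out as an ordinary two-variable regression on deviations from the means, which is one way of obtaining the normal equations quoted above. The sums of squares needed for an analysis of variance are formed at the same time.

```python
import numpy as np

# Table 1: dose -> duplicate titrations (ml. NaOH).
standard = {0.05: (3.5, 3.2), 0.10: (5.0, 4.7), 0.15: (6.2, 6.1),
            0.20: (8.0, 7.7), 0.25: (9.4, 9.5)}
test = {1.0: (4.9, 4.8), 1.5: (6.3, 6.5), 2.0: (7.7, 7.7)}
blanks = (1.5, 1.4)

# One row per tube: (x_s, x_t, y); every tube has x_s = 0 or x_t = 0.
rows = [(x, 0.0, y) for x, ys in standard.items() for y in ys]
rows += [(0.0, x, y) for x, ys in test.items() for y in ys]
rows += [(0.0, 0.0, y) for y in blanks]
data = np.array(rows)
X, y = data[:, :2], data[:, 2]

Xc = X - X.mean(axis=0)           # deviations from the means
yc = y - y.mean()
A = Xc.T @ Xc                     # coefficients of the normal equations
rhs = Xc.T @ yc                   # right-hand sides
V = np.linalg.inv(A)              # inverse matrix of the coefficients
b_s, b_t = V @ rhs
R = b_t / b_s                     # µg. nicotinic acid per ml. of test solution
potency = 5000 * R                # µg. per g. of meat extract

# Sums of squares for the analysis of variance.
total_ss = float(yc @ yc)
reg_ss = float(np.array([b_s, b_t]) @ rhs)
pairs = list(standard.values()) + list(test.values()) + [blanks]
within_ss = sum((p - q) ** 2 / 2 for p, q in pairs)
s2 = within_ss / len(pairs)       # 9 pairs give 9 degrees of freedom
print(A, rhs, b_s, b_t, R, potency, s2)
```

The script reproduces the coefficients 0.15, -0.75, 10.00 and the right-hand sides 2.34 and 7.65, which is a useful check that the table has been read correctly.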

Table 2 

Analysis of Variance for Nicotinic Acid Assay 

                                  d.f.    Sum of squares.    Mean square.

    Linear regression               2         96.412
    Deviations from linearity       6          0.278            0.0463
    Between doses                   8         96.690
    Within doses                    9          0.175            0.0194
    Total                          17         96.865

Table 2 shows the analysis of variance for these data: the mean square for deviations from 
linearity, though larger than the error mean square, does not indicate any significant non-linearity. 
The error mean square is 

$$s^2 = 0.0194,$$

and the product of this with the elements of $V$ gives the variances and covariance of $b_s$, $b_t$. 
Providing that $b_s$ is large in comparison with its standard error, a standard error for $R$ may be 
derived from 

$$s_R^2 = V(R) = \{V(b_t) - 2R\,\mathrm{Cov}(b_s, b_t) + R^2 V(b_s)\}/b_s^2 \tag{31}$$

$$= 0.0194 \times (0.1600 - 0.1594 + 0.1058)/966$$

$$= 0.00000214$$

$$= (0.00146)^2.$$

The estimated potency is therefore 498 ± 7.3 µg. per g. 

If the elements of the matrix $V$ are written as $v_{ss}$, $v_{st}$, $v_{tt}$, the general expression for the fiducial 
limits of $R$, analogous to (27), may be derived from Fieller’s (1944) formula and written 

$$\frac{1}{1-g}\left[R - g\frac{v_{st}}{v_{ss}} \pm \frac{ts}{b_s}\left\{v_{tt} - 2Rv_{st} + R^2 v_{ss} - g\left(v_{tt} - \frac{v_{st}^2}{v_{ss}}\right)\right\}^{1/2}\right], \tag{32}$$

$$g = s^2 v_{ss} t^2/b_s^2. \tag{33}$$

In a good microbiological assay there should be relatively much less variation between replicate 
tubes than is found between animals in a macro-biological assay. Consequently $g$ is usually 
negligible, so that (32) reduces to $R \pm s_R t$. In the example just considered, the 5 per cent. level 
of $t$ is 2.26, so that $g = 0.0011$, a negligible quantity, and the fiducial limits to the estimated potency 
of the extract are 482 and 514 µg. per g. 
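
These figures can be checked from the quantities already quoted. The sketch below is illustrative only: the variable names are hypothetical, the inverse-matrix element 10.6 recurring is entered as 32/3, and the limits are computed from the general Fieller form for a ratio, which is assumed here to be the expression intended. Carrying full precision through the calculation gives limits of about 481.7 and 514.8, which agree with the quoted 482 and 514 to within the rounding of 498 and 7.3.

```python
import math

b_s, b_t = 31.08, 3.096                    # regression coefficients from the example
v_ss, v_st, v_tt = 32.0 / 3.0, 0.8, 0.16   # elements of the inverse matrix V
s2 = 0.175 / 9                             # error mean square, 9 degrees of freedom
t = 2.26                                   # 5 per cent. level of t, as quoted
scale = 5000                               # ml. of solution per gram of extract

R = b_t / b_s
quad = v_tt - 2 * R * v_st + R ** 2 * v_ss
s_R = math.sqrt(s2 * quad / b_s ** 2)      # standard error of R
g = s2 * v_ss * t ** 2 / b_s ** 2

# Limits from the assumed Fieller form; they reduce to R +/- t * s_R as g -> 0.
half = t * math.sqrt(s2) / b_s * math.sqrt(quad - g * (v_tt - v_st ** 2 / v_ss))
lower = (R - g * v_st / v_ss - half) / (1 - g)
upper = (R - g * v_st / v_ss + half) / (1 - g)

print(round(scale * R), round(scale * s_R, 1), round(g, 4))
print(scale * lower, scale * upper)
```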

If the dose-response relationship cannot be put into the form of equation (11), the assay problem 
is much more difficult. Little use has been made of polynomial relationships such as 

$$U = \alpha + \beta x + \gamma x^2 + \ldots,$$

or of regression equations of the form 

C/ = a(l — 

since for them there is no simple arithmetical procedure. For the quadratic, indeed, 

$$U = \alpha + \beta x + \gamma x^2,$$





where $x = \log z$, an unbiased estimate of $\log \rho$ can be obtained from a symmetrical four-point 
assay (two doses of each preparation, at equal intervals of $x$, and equal numbers of subjects in 
all four groups) by ignoring the quadratic term; the estimate is the same as that derived from the 
fitting of two parallel lines (Gridgeman, 1943; Wood, 1944a). But for a greater number of dose 
levels, the fitting of pairs of equations of the form 

$$\left.\begin{aligned}
U_s &= a + bx + cx^2, \\
U_t &= a + b(x + M) + c(x + M)^2,
\end{aligned}\right\} \tag{34}$$

or, if $x = z^{\lambda}$, 

$$\left.\begin{aligned}
U_s &= a + bx + cx^2, \\
U_t &= a + bR^{\lambda}x + cR^{2\lambda}x^2,
\end{aligned}\right\} \tag{35}$$

would involve a very troublesome set of computations. 
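
The remark about the four-point assay is easily verified on error-free figures. The following sketch is an illustration under stated assumptions: the function name `four_point_log_potency` and the numerical values of the quadratic coefficients and of the log potency are hypothetical. Mean responses are generated from a quadratic in the effective log dose, the parallel-line estimate is formed as though the curvature were absent, and the true log potency is recovered exactly.

```python
def four_point_log_potency(x_s, y_s, x_t, y_t):
    """Parallel-line estimate of log potency from a symmetrical four-point assay.

    x_s, x_t: (low, high) log doses, equally spaced; y_s, y_t: mean responses.
    """
    d = x_s[1] - x_s[0]
    if abs((x_t[1] - x_t[0]) - d) > 1e-9:
        raise ValueError("dose intervals must be equal for the two preparations")
    # Common slope: average of the two within-preparation slopes.
    slope = ((y_s[1] - y_s[0]) + (y_t[1] - y_t[0])) / (2.0 * d)
    xbar_s, xbar_t = sum(x_s) / 2.0, sum(x_t) / 2.0
    ybar_s, ybar_t = sum(y_s) / 2.0, sum(y_t) / 2.0
    return xbar_s - xbar_t + (ybar_t - ybar_s) / slope

# Hypothetical quadratic response curve and true log potency M.
a, b, c, M = 1.0, 2.0, 0.3, 0.4

def curve(e):
    return a + b * e + c * e ** 2

x_s, x_t = (0.0, 1.0), (-0.2, 0.8)
y_s = tuple(curve(x) for x in x_s)
y_t = tuple(curve(x + M) for x in x_t)     # test doses act as log doses x + M
m_hat = four_point_log_potency(x_s, y_s, x_t, y_t)
print(m_hat)
```

The cancellation depends on the symmetry of the design; with three or more dose levels the curvature no longer drops out in this way.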


Emmens (1940) has proposed the use of the logistic function for assays based on quantitative 
responses, when the range of responses is too great for a linear relationship between $U$ and $x$ to be 
a satisfactory representation of the data. He has found this function to agree well with observation 
in a number of assays. His equation is equivalent to 

$$U = \tfrac{1}{2}H\{1 + \tanh(\alpha + \beta x)\}, \tag{36}$$

where $x$ is any suitable dose metameter; this represents a mean response which approaches 
asymptotically the maximum value $H$. If $H$ is known, or can be reliably estimated from previous 
work, the transformation 

$$Y = \tanh^{-1}\left(\frac{2U - H}{H}\right) \tag{37}$$

can be used to reduce the relationship to the form of equation (13). The general procedure can 
then be used for the estimation of the parameters $\alpha$ and $\beta$, in a form the same as that given for 
“logits” in Section 5 except for a difference in the weighting coefficient. Any other sigmoid 
curve might be considered instead of equation (36), such as, on the analogy of probits, 

$$U = \frac{H}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta x} e^{-\frac{1}{2}t^2}\,dt, \tag{38}$$

but for many quantitative biological data perhaps the logistic is the most reasonable to try. When 
H is not known in advance and has itself to be estimated from the data, the computations become 
more troublesome. Finney (1947) has shown how the standard form of calculations for the 
probit transformation can be generalized for the fitting of equation (38), and his method can be 
modified so as to suit equation (36) or any other of the transformations considered in Section 5. 
The process is scarcely likely to prove popular for routine assays if by a suitable choice of experi- 
mental conditions the necessity for it can be avoided. 

In a study of dose-time-mortality relationships, Ipsen (1941) has suggested simplifying certain
response curves by using an “equivalent dose” instead of time. His idea might be adapted for
use with assays in which no obvious choice of dose and response metameters reduces the response
curve to the form of equation (11). So far as is known to the writer, this approach has never
yet been used, and a number of theoretical objections to it could be raised; nevertheless it might
prove helpful in the assay of a group of materials showing an otherwise intractable form of response
curve. From tests at a number of different doses, the relationship between U and the logarithm
of the dose, x, might be determined empirically for the standard preparation, the curve being fitted
either by eye or as a polynomial regression on x:

$U = f(x).$ (39)




Providing that Condition I is satisfied by this function, in subsequent tests for assay purposes
a response metameter may be defined by the inverse function

$Y = f^{-1}(U),$

which is read either directly from the empirical curve or from a tabulation of its values. If the
potency of the standard preparation were exactly as in the preliminary tests, the regression equations
for the two preparations would be

$Y_S = x, \qquad Y_T = x + \log \rho;$ (40)

even if conditions have changed to some extent since the preliminary tests, as long as Condition II
holds equations of the form of (13), (14.2) are likely to fit the data tolerably well. Hence ρ may
be estimated graphically as before or, if the empirical function (39) seems sufficiently reliable to
justify the trouble, f′(Y) may be found from it and the maximum likelihood process continued
in the standard manner. Alternatively, if the response curve seemed simpler with as a dose 
metameter, a similar course could be followed with an empirical form of equation (39) fitted to 
this metameter. If the potency of the standard remained unaltered in subsequent assays, the
regression equations would then be


1 

Vi - P-v, j 


(41) 


and in any case equations (13), (14.1) could probably be fitted to the data satisfactorily provided 
that Condition II held. This method, however, is likely to be of limited applicability and will 
not be considered further here. 


5. Quantal Responses 

Many important assays are dependent upon a quantal or “all-or-nothing” response. The
typical response of this kind is death of the test subject, but there are others, such as spore
germination in fungicidal assays, or cure of deficiency symptoms in vitamin assays, for which no
graduation of the response is possible and all that can be done is to describe a test subject as
responding or not responding to a dose z. The role of u in Section 2 is now played by p, the
proportion of test subjects responding in a sample receiving a dose z. The expectation of p will
be a function

$P = F(z)$ (42)

which will usually satisfy Condition I, and for which, over the range of doses to be used,

$0 \le F(z) \le 1.$

For convenience of nomenclature, the response will be considered to be death of the subject.

Unless there is a natural mortality amongst undosed subjects (Finney, 1944), F(0) must be zero,
and unless some of the subjects are immune to the material under test F(z) must become unity at
a sufficiently high value of z. These possible complications will be considered later. If the
reasonable assumption that any subject killed by a dose z would, under the same circumstances,
have been killed by any dose greater than z (and consequently that any subject surviving a dose z
would have survived any dose less than z) be accepted, then with each individual in the population
from which the test subject was drawn can be associated a threshold value of z, the tolerance or
maximum dose which would just fail to kill the individual under the circumstances of the test.
This concept is entirely different from that of the “minimal lethal dose” at one time popular with
toxicologists. The idea of measuring potency in terms of the least dose potentially fatal to all
test subjects has been abandoned, as a result of the pioneer thinking of Trevan (1927) and Gaddum
(1933), who showed it to be inconsistent with the known occurrence of wide variation in
susceptibility from subject to subject; instead a frequency distribution of individual tolerances must be
envisaged.





If n subjects are each given a dose z, to which they react independently, the probability that r
will die and (n − r) survive is

$\dbinom{n}{r} P^{r} Q^{\,n-r},$

where Q = 1 − P. Hence the probability of obtaining $r_1, r_2, r_3, \ldots$ deaths in batches of
$n_1, n_2, n_3, \ldots$ receiving doses $z_1, z_2, z_3, \ldots$ is proportional to $e^{L}$, where

$L = S\,r \log P + S\,(n - r) \log Q.$ (43)


Application of the principle of maximum likelihood shows that if θ is a typical unknown parameter
of the function P( ), estimates of θ and other parameters will be obtained from the simultaneous
solution of equations of the form

$\dfrac{\partial L}{\partial \theta} = S\,\dfrac{n(p - P)}{PQ}\,\dfrac{\partial P}{\partial \theta} = 0,$ (44)


where p = r/n is the estimate of P for a dose z. Now equation (44) is very similar to the maximum
likelihood equation derived from equation (15), except that the binomial variance, PQ/n, has
replaced σ². The argument of Section 3 may be repeated to show that, when P can be expressed
in the form

$P = f(\alpha + \beta x),$ (45)

the estimation of α and β may be carried out with the aid of weighted linear regression
computations: if Y = a + bx, where a and b are first approximations to the maximum likelihood
estimates, improved estimates are obtained from the weighted regression of a working response on x.
As in equation (20), working responses are derived from the first approximations and are

$y = Y + (p - P)/f'(Y),$
or $y = Y - (q - Q)/f'(Y);$ (46)

the weight to be attached to a working response based on n subjects is nw where

$w = \{f'(Y)\}^{2}/PQ.$ (47)

Again the first approximation may be obtained by plotting the metameters of the empirical values
of p against x and fitting a straight line to these by eye. If the same response transformation has
to be used often, a table of Y as a function of P, and tables of the weighting coefficient, the minimum
working response, and the range as functions of Y will be helpful. The minimum working
response is now a true minimum, being the value of y to be taken when the observed kill is zero
(p = 0); the range is the difference between this and the maximum working response for the same
Y, the working response when all the subjects are killed (q = 0). Quantal responses may be used
for assay purposes just as easily as quantitative responses. The necessary extension of the above
procedure is exactly that described in the last paragraph of Section 3; an example is given below.
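
The cycle of computations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's own computing scheme: the function name `regression_cycle`, the choice of the logistic form P = ½(1 + tanh Y) for f, the starting values, and the use of the lymph-sac counts of Table 5 as test data are all assumptions made for the illustration.

```python
import math

def regression_cycle(x, n, r, a, b, f, f_prime):
    """One cycle: provisional line Y = a + b*x -> improved estimates (a, b)."""
    sw = swx = swy = swxx = swxy = 0.0
    for xi, ni, ri in zip(x, n, r):
        Y = a + b * xi                        # expected response metameter
        P = f(Y)
        Q = 1.0 - P
        p = ri / ni                           # observed proportion responding
        y = Y + (p - P) / f_prime(Y)          # working response, equation (46)
        nw = ni * f_prime(Y) ** 2 / (P * Q)   # weight n*w, equation (47)
        sw += nw
        swx += nw * xi
        swy += nw * y
        swxx += nw * xi * xi
        swxy += nw * xi * y
    # weighted linear regression of the working response on x
    b_new = (swxy - swx * swy / sw) / (swxx - swx * swx / sw)
    a_new = swy / sw - b_new * swx / sw
    return a_new, b_new

def logistic(Y):
    # assumed form of f for this illustration (no offset of 5)
    return 0.5 * (1.0 + math.tanh(Y))

def logistic_prime(Y):
    return 0.5 / math.cosh(Y) ** 2

# lymph-sac counts from Table 5: log dose, frogs tested, frogs killed
x = [0.75, 0.85, 0.95]
n = [15, 15, 15]
r = [2, 5, 8]

a, b = -5.0, 6.0            # rough starting line, as if fitted by eye
for _ in range(25):         # repeat the cycle until the line settles
    a, b = regression_cycle(x, n, r, a, b, logistic, logistic_prime)
```

Each pass replaces the provisional line by the weighted regression of the working responses on x; when the line no longer changes, the estimates satisfy equations of the form (44).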

The frequency distribution of individual tolerances in terms of the dose, z, is easily derived
from equation (42) as

$dP = F'(z)\,dz$ (48)


or, in terms of the dose metameter for data in agreement with equation (45),

$dP = \beta f'(\alpha + \beta x)\,dx.$ (49)


The most useful distribution of individual tolerances to study is that which gives a normal
distribution of the corresponding dose metameters. If equation (49) takes the form

$dP = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^{2}/2\sigma^{2}}\,dx,$ (50)

the appropriate response metameter is defined by

$P = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{Y} e^{-\frac{1}{2}t^{2}}\,dt,$ (51)

and equation (13) becomes

$Y = (x - \mu)/\sigma.$ (52)

Gaddum (1933) termed Y in this transformation the normal equivalent deviate of P; Bliss (1934a, b)
introduced the name probit for the same quantity increased by 5 in order to avoid negative values.
The probit transformation has proved satisfactory for simplifying the expression of the
dose-response relationship for many types of quantal response (Bliss and Cattell, 1943); Bliss (1935a, b),
Irwin (1937), and others have described the use of probits in biological assay. No example of the
probit method will be given, as its use has been so frequently illustrated elsewhere, and the procedure
is exactly parallel to that shortly to be described for “logits.”

There is no limit to the number of alternative functions that could theoretically be taken for
f( ), though few of them are likely to be of much practical importance. Urban (1909, 1910), who
followed Fechner (1860) in using the essentials of the probit transformation for psychometric
investigations many years before the development of its biological applications, suggested what in
the present notation may be written as

$P = \tfrac{1}{2} + \dfrac{1}{\pi} \tan^{-1} Y \qquad (-\infty < Y < \infty).$

Wilson and Worcester (1943a, b, c) established equations relating to the use of quantal responses
in biological assay, though not in quite so general a form as the equations given here, and discussed
several alternative functions. In addition to the probit transformation they considered

$P = \tfrac{1}{2}(1 + \tanh Y) \qquad (-\infty < Y < \infty),$

$P = \tfrac{1}{2}(1 + Y) \qquad (-1 \le Y \le 1),$

$P = \tfrac{1}{2}(1 + \sin Y) \qquad (-\tfrac{1}{2}\pi \le Y \le \tfrac{1}{2}\pi),$

$P = \tfrac{1}{2}\left(1 + \dfrac{Y}{\sqrt{1 + Y^{2}}}\right).$

The first of these uses a logistic curve instead of the sigmoid obtained as the cumulative normal
curve. The second is merely a linear function of P, which might have been chosen, less
symmetrically but more simply, as

$P = Y \qquad (0 \le Y \le 1),$

and the third is effectively the same as

$P = \sin^{2} Y \qquad (0 \le Y \le \tfrac{1}{2}\pi),$

a transformation introduced by Bliss for use in the analysis of variance of percentage data.

From equations (46) and (47), the various formulae required for use with these different 
response metameters can be obtained, and are as follows : — 






In addition to their tables for the first of these, the probit transformation, Fisher and Yates (1947)
have given tables for transformation No. 5, commonly known as the angular transformation,
which can be used in the same manner, and have indicated how other tables in their collection
can be used for transformation No. 3. “Transformation” No. 4 presents no difficulties; the
results above show that if P itself is linearly related to the dose metameter, the empirical values p
are to be used as working responses and weighted inversely as the binomial variance estimated
from the provisional line (of course such a relationship could scarcely occur except as an
approximation, excluding extreme values of P). For the six forms of the function f( ) at present under
examination, equation (49) gives the frequency distributions of individual tolerances as

$dP_1 = \dfrac{\beta}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(\alpha + \beta x)^{2}}\,dx,$

$dP_2 = \dfrac{\beta}{\pi}\,\{1 + (\alpha + \beta x)^{2}\}^{-1}\,dx,$

$dP_3 = \tfrac{1}{2}\beta\,\mathrm{sech}^{2}(\alpha + \beta x)\,dx,$

$dP_4 = \beta\,dx,$

$dP_5 = \beta \sin 2(\alpha + \beta x)\,dx,$

$dP_6 = \tfrac{1}{2}\beta\,\{1 + (\alpha + \beta x)^{2}\}^{-3/2}\,dx.$
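
As a rough check on the six forms, the sketch below codes each f(Y) with its derivative and evaluates the weighting coefficient of equation (47). The numbering 1 to 6 follows the order used in the text (probit, inverse tangent, logistic, linear, angular, and the last of the Wilson and Worcester forms); the dictionary `METAMETERS` and the function `weight` are illustrative names only, and the sixth form is taken here as P = ½{1 + Y/√(1 + Y²)}, which is an assumption based on the shape of its tolerance distribution.

```python
import math

def normal_cdf(Y):
    return 0.5 * (1.0 + math.erf(Y / math.sqrt(2.0)))

# Each entry: (f(Y), f'(Y)) for one response metameter
METAMETERS = {
    1: (normal_cdf,
        lambda Y: math.exp(-0.5 * Y * Y) / math.sqrt(2.0 * math.pi)),
    2: (lambda Y: 0.5 + math.atan(Y) / math.pi,
        lambda Y: 1.0 / (math.pi * (1.0 + Y * Y))),
    3: (lambda Y: 0.5 * (1.0 + math.tanh(Y)),
        lambda Y: 0.5 / math.cosh(Y) ** 2),
    4: (lambda Y: Y,
        lambda Y: 1.0),
    5: (lambda Y: math.sin(Y) ** 2,
        lambda Y: math.sin(2.0 * Y)),
    6: (lambda Y: 0.5 * (1.0 + Y / math.sqrt(1.0 + Y * Y)),   # assumed form
        lambda Y: 0.5 * (1.0 + Y * Y) ** -1.5),
}

def weight(k, Y):
    """Weighting coefficient w = f'(Y)^2 / (P Q), equation (47)."""
    f, f_prime = METAMETERS[k]
    P = f(Y)
    return f_prime(Y) ** 2 / (P * (1.0 - P))
```

Two features noted in the text follow directly: for No. 4 the weight is simply the reciprocal of PQ, and for the angular transformation (No. 5) the weight is the constant 4 whatever the value of Y.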


Berkson (1944) advocated using the logistic curve instead of the normal sigmoid in the analysis
of data from quantal assays; he argued, “However, the logistic function is very close to the
integrated normal curve, it applies to a wide range of physico-chemical phenomena and therefore may
have a better theoretic basis than the integrated normal curve.” This statement seems to be
based on a misconception, since the reason for introducing the transformation specified by
equation (51) is not directly the shape of the normal sigmoid, but the derivation of the
transformation from a normal distribution of tolerance as measured on the x-scale. Though absolute
normality of distribution may seldom be attained, there seems little cause in general to prefer a
distribution curve whose ordinate is proportional to sech²(α + βx).

Extensive tables now available (Bliss, 1935a, 1939; Fisher and Yates, 1947; Finney, 1947) 
make use of the probit transformation a simple computational routine. De Beer (1945) has made 
ingenious suggestions for improving the graphical method with the aid of specially graduated 
rules; his scheme does not take full account of the finer points of the maximum likelihood solution,
since it only employs working probits for zero and 100 per cent, kills, but it should be good enough 
for most practical purposes and might prove much quicker than the full computations. 

Wilson and Worcester (1943a) have given tables to simplify the computations for the logistic 
transformation, providing that there are only three doses, with equal numbers of subjects at each. 
Berkson (1944) has developed a computational procedure for the logistic transformation, which 
fails to take account of the asymmetry of binomial frequency distributions by the introduction of 
“working responses” or otherwise, though he draws attention to this possibility, and thus does 
not lead to the maximum likelihood solution. He says, “I believe that the work of fitting the 
logistic as given here is considerably simpler than that of fitting the normal curve by probits and 
maximum likelihood as advocated by Bliss and Fisher.” This greater simplicity is entirely a 
result of the omission of the calculation of working responses; with the aid of Tables 3 and 4 
below, these may be introduced into the logistic calculations and the two transformations are 
then exactly equal in respect of the amount of computing they require. 

In practice, the alternative transformations given above, as well as others such as the rankit
(Ipsen and Jerne, 1944), would often give very similar results, and many series of data would fail
to distinguish one of them as peculiarly appropriate; even the rectangular tolerance distribution,
corresponding with

$P = \alpha + \beta x,$


though objectionable on theoretical grounds, may not be too bad a fit unless the data either involve
extreme values of P or are very precise. Unless the data clearly indicate the need for something
different, the probit seems the obvious choice for use in assays. Nevertheless, that based on the





logistic curve is not without interest, for it implies a linear regression of log (P/Q) on log z and 
may therefore be of value in connection with the Langmuir adsorption law (Bliss, 1935^?). Adapting 
Berkson’s terminology, but, as with probits, taking steps to avoid the occurrence of negative
values, the logit of P may be defined to be Y in the equation

$P = \tfrac{1}{2}\{1 + \tanh(Y - 5)\},$
or $Y = 5 + \tanh^{-1}(2P - 1).$ (53)


In Table 3, Y is shown as a function of P. The expressions for minimum and maximum working
logit, range, and weighting coefficient have been given above (transformation No. 3), and
numerical values at intervals of 0.1 in Y are given in Table 4.
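
Entries of Table 3 can be recomputed directly from the definition of the logit in equation (53). The following is a minimal sketch; the function names `logit` and `inverse_logit` are chosen here for illustration and are not taken from the paper.

```python
import math

def logit(P):
    """Logit of a proportion P (0 < P < 1), as in equation (53)."""
    return 5.0 + math.atanh(2.0 * P - 1.0)

def inverse_logit(Y):
    """Proportion P corresponding to a logit Y."""
    return 0.5 * (1.0 + math.tanh(Y - 5.0))

# a few tabular points: 10, 50, 99 and 99.9 per cent.
values = [round(logit(P), 3) for P in (0.10, 0.50, 0.99, 0.999)]
```

For these four percentages the computed logits agree with the tabulated 3.901, 5.000, 7.298 and 8.453.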

As both the probit and the logit transformations require that P = 0 should correspond to an
infinite negative value of x, the only simple dose metameter of interest is the logarithm of the dose.
The arithmetical procedure for assaying a test material against a standard is then very similar to
that described in the early part of Section 4. The fitted regression lines are constrained to be
parallel in each cycle of the computations (one cycle is often sufficient) and the lines are found in
the form of equations (22). The estimate of relative potency is then obtained from equation (23)
and its variance from equation (24), with the difference that $N_S$, $N_T$, instead of being the total
numbers of subjects for the two preparations, are now the totals of the weights, Snw, where n is
the number in a batch for which the weighting coefficient is w; fiducial limits are obtained from the
expression (27). The weights, being derived from binomial frequency distributions, are equal to
(instead of merely proportional to) the reciprocals of the variances of the response metameters, and
therefore the residual sum of squares may be referred to the χ² distribution as a test of agreement
between the results and hypothesis. Providing that there is no serious evidence of discrepancy,
the variance per unit weight, s², may be taken as unity and fiducial limits calculated by use of
normal instead of t-deviates. A significant value of χ² may indicate that the dose-response
regression function is of a different form from that used, thus leading to non-linearity in the
(x, Y) plane, or it may indicate that through unavoidable, or perhaps avoidable, flaws in
experimental technique the several batches of test subjects are heterogeneous. If the second explanation
seems appropriate, an adjustment may be made by writing


$s^{2} = \chi^{2}/(\text{degrees of freedom})$


instead of unity, and reverting to the use of t-deviates.


Table 3

Transformation of Percentages to Logits

Percentage    0      1      2      3      4      5      6      7      8      9

     0       −∞    2.702  3.054  3.262  3.411  3.528  3.624  3.707  3.779  3.843
    10      3.901  3.955  4.004  4.050  4.092  4.133  4.171  4.207  4.242  4.275
    20      4.307  4.338  4.367  4.396  4.424  4.451  4.477  4.503  4.528  4.552
    30      4.576  4.600  4.623  4.646  4.668  4.690  4.712  4.734  4.755  4.776
    40      4.797  4.818  4.839  4.859  4.879  4.900  4.920  4.940  4.960  4.980
    50      5.000  5.020  5.040  5.060  5.080  5.100  5.121  5.141  5.161  5.182
    60      5.203  5.224  5.245  5.266  5.288  5.310  5.332  5.354  5.377  5.400
    70      5.424  5.448  5.472  5.497  5.523  5.549  5.576  5.604  5.633  5.662
    80      5.693  5.725  5.758  5.793  5.829  5.867  5.908  5.950  5.996  6.045
    90      6.099  6.157  6.221  6.293  6.376  6.472  6.589  6.738  6.946  7.298

             0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9

    97      6.738  6.756  6.774  6.792  6.812  6.832  6.853  6.874  6.897  6.921
    98      6.946  6.972  7.000  7.029  7.060  7.092  7.127  7.165  7.205  7.249
    99      7.298  7.351  7.410  7.477  7.555  7.647  7.759  7.903  8.106  8.453



Table 4

Maximum and Minimum Working Logits, Range, and Weighting Coefficient

Expected   Minimum      Range      Maximum     Expected   Weighting
logit      working      1/2PQ      working     logit      coefficient
Y          logit                   logit       Y          w

1.1         0.5998     1221.3       9.4002      8.9        .00164
1.2         0.6997     1000.1       9.3003      8.8        .00200
1.3         0.7997      819.0       9.2003      8.7        .00244
1.4         0.8996      670.7       9.1004      8.6        .00298
1.5         0.9995      549.3       9.0005      8.5        .00364
1.6         1.0994      449.92      8.9006      8.4        .00445
1.7         1.1993      368.55      8.8007      8.3        .00543
1.8         1.2992      301.92      8.7008      8.2        .00662
1.9         1.3990      247.38      8.6010      8.1        .00808
2.0         1.4988      202.72      8.5012      8.0        .00987
2.1         1.5985      166.15      8.4015      7.9        .01204
2.2         1.6982      136.22      8.3018      7.8        .01468
2.3         1.7977      111.71      8.2023      7.7        .01790
2.4         1.8972       91.64      8.1028      7.6        .02182
2.5         1.9966       75.21      8.0034      7.5        .02659
2.6         2.0959       61.759     7.9041      7.4        .03238
2.7         2.1950       50.747     7.8050      7.3        .03941
2.8         2.2939       41.732     7.7061      7.2        .04793
2.9         2.3925       34.351     7.6075      7.1        .05822
3.0         2.4908       28.308     7.5092      7.0        .07065
3.1         2.5888       23.362     7.4112      6.9        .08561
3.2         2.6863       19.313     7.3137      6.8        .10356
3.3         2.7833       15.999     7.2167      6.7        .12501
3.4         2.8796       13.287     7.1204      6.6        .15053
3.5         2.9751       11.068     7.0249      6.5        .18071
3.6         3.0696        9.2527    6.9304      6.4        .21615
3.7         3.1629        7.7690    6.8371      6.3        .25743
3.8         3.2546        6.5569    6.7454      6.2        .30502
3.9         3.3446        5.5679    6.6554      6.1        .35920
4.0         3.4323        4.7622    6.5677      6.0        .41997
4.1         3.5174        4.1075    6.4826      5.9        .48692
4.2         3.5991        3.5775    6.4009      5.8        .55906
4.3         3.6767        3.1509    6.3233      5.7        .63474
4.4         3.7494        2.8107    6.2506      5.6        .71158
4.5         3.8161        2.5431    6.1839      5.5        .78645
4.6         3.8753        2.3374    6.1247      5.4        .85564
4.7         3.9256        2.1855    6.0744      5.3        .91514
4.8         3.9648        2.0811    6.0352      5.2        .96104
4.9         3.9906        2.0201    6.0094      5.1        .99007
5.0         4.0000        2.0000    6.0000      5.0       1.00000
5.1         3.9893        2.0201    6.0107      4.9        .99007
5.2         3.9541        2.0811    6.0459      4.8        .96104
5.3         3.8889        2.1855    6.1111      4.7        .91514
5.4         3.7873        2.3374    6.2127      4.6        .85564
5.5         3.6408        2.5431    6.3592      4.5        .78645
5.6         3.4399        2.8107    6.5601      4.4        .71158
5.7         3.1724        3.1509    6.8276      4.3        .63474
5.8         2.8234        3.5775    7.1766      4.2        .55906
5.9         2.3751        4.1075    7.6249      4.1        .48692
6.0         1.8055        4.7622    8.1945      4.0        .41997
6.1         1.0875        5.5679    8.9125      3.9        .35920
6.2         0.1885        6.5569    9.8115      3.8        .30502
6.3        −0.9319        7.7690   10.9319      3.7        .25743
6.4        −2.3223        9.2527   12.3223      3.6        .21615
6.5        −4.0428       11.0677   14.0428      3.5        .18071




The working logit, y, may be obtained as

$y = Y - \dfrac{1}{2Q} + \dfrac{p}{2PQ}$ or $y = Y + \dfrac{1}{2P} - \dfrac{q}{2PQ},$

whichever is the more convenient, where p (= 1 − q) is the observed proportion killed.
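
The quantities tabulated for each expected logit can be regenerated from the logistic form with f′(Y) = 2PQ, which gives a minimum working logit Y − 1/2Q, a range 1/2PQ, a maximum working logit Y + 1/2P and a weighting coefficient 4PQ. The sketch below is illustrative only; `logit_row` and `working_logit` are names invented for it.

```python
import math

def logit_row(Y):
    """Minimum and maximum working logit, range and weight for expected logit Y."""
    P = 0.5 * (1.0 + math.tanh(Y - 5.0))
    Q = 1.0 - P
    return {
        "min": Y - 1.0 / (2.0 * Q),      # working logit when p = 0
        "range": 1.0 / (2.0 * P * Q),    # 1/f'(Y)
        "max": Y + 1.0 / (2.0 * P),      # working logit when q = 0
        "w": 4.0 * P * Q,                # f'(Y)^2 / PQ
    }

def working_logit(Y, p):
    """Working logit y = minimum + p * range for an observed proportion p."""
    row = logit_row(Y)
    return row["min"] + p * row["range"]

row4 = logit_row(4.0)
row6 = logit_row(6.0)
```

For Y = 4.0 this gives a minimum working logit near 3.4323, a range near 4.7622 and a weight near 0.41997; for Y = 6.0 the maximum working logit is near 6.5677.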

Example: Miller et al. (1939) have given the results of a comparison of the toxicity of digitalis
to frogs when injected in two different ways, intramuscularly and into the ventral lymph sac. The
… estimation of the potency of intramuscular … injection. The dose levels are obviously too few
to provide any … transformation is as good as or better than the probit; indeed, as will be seen
from Table 7, … results are obtainable without any transformation … not justify the introduction
of the complications and refinements of the … technique. The full calculations are given, however,
as an illustration of the system … transformation or to any of the … logarithm of dose (in c.c. of
digitalis pre… the number of frogs tested at that dose, and the number killed, from which the …
mortalities in column 4 are calculated. Empirical logits corresponding to these proportions are
read from Table 3 and graphed against x in Fig. 2. Two parallel




Table 5

Preliminary Logit Computations for Comparison of Potency of Two Methods of Injection of
Digitalis in Frogs

(Data of Miller et al.)

Log     Number     Number   Proportion   Empirical
dose    of frogs   killed   killed       logit        Y      nw      y       nwx       nwy
x       n          r        p

Lymph sac:
.75     15          2       .13          4.05         4.0     6.3    4.05     4.725    25.515
.85     15          5       .33          4.65         4.6    12.8    4.65    10.880    59.520
.95     15          8       .53          5.06         5.1    14.9    5.06    14.155    75.394
                                                             34.0            29.760   160.429

Intramuscular:
.45     15          2       .13          4.05         4.1     7.3    4.05     3.285    29.565
.55     15          5       .33          4.65         4.7    13.7    4.65     7.535    63.705
.65     15         10       .67          5.35         5.3    13.7    5.35     8.905    73.295
                                                             34.7            19.725   166.565

$\bar{x}_S = 0.8753$, $\bar{y}_S = 4.7185$
$\bar{x}_T = 0.5684$, $\bar{y}_T = 4.8001$

                    Snwx²        Snwxy       Snwy²
Lymph sac:
                   26.23900     141.3526    761.597
                   26.04875     140.4226    756.984
                    0.19025       0.9300      4.613
Intramuscular:
                   11.41075      95.9838    808.095
                   11.21255      94.6828    799.536
                    0.19820       1.3010      8.559


Table 6

Computations for Relative Potency on Data of Table 5

                    Sxx         Sxy        Syy        χ²
Lymph sac          0.19025     0.9300      4.613     0.067
Intramuscular      0.19820     1.3010      8.559     0.019
Total              0.38845     2.2310     13.172     0.359

Heterogeneity: χ²[2] = 0.067 + 0.019 = 0.086

Parallelism: χ²[1] = 0.359 − 0.086 = 0.273

b = 2.2310/0.38845 = 5.7433

Regression equations                    log LD50

$Y_S = -0.309 + 5.743x$                  0.924

$Y_T = 1.536 + 5.743x$                   0.603

log R = 0.321 ± 0.042

$V(\log R) = [0.02941 + 0.02882 + 0.00050]/b^{2} = (0.0422)^{2}$

$g = 3.84/0.38845\,b^{2} = 0.300$

Fiducial limits $= 0.321 + \dfrac{g}{1-g}(\log R - \bar{x}_S + \bar{x}_T) \pm \dfrac{t}{b(1-g)}\sqrt{0.7 \times 0.05823 + 0.00050}$

$= 0.327 \pm 0.099$

$= 0.426,\ 0.228$


lines are then fitted by eye to the points of this diagram and the expected Y corresponding to each x
is tabulated. The values of Y are used in conjunction with Table 4 to give the weights, nw, and
the working logits, y. Since n is 15 for every dose, the multiplication by 15 could have been
deferred until later, but the more general pattern of computations is shown here; the provisional
lines in this example fit the data so closely that the working logits are the same as the empirical.
Sums of squares and products are obtained as in the remainder of Table 5; the numbers of decimal
places shown at all stages are sufficient for logit and probit computations unless the numbers of
test subjects are much greater than here.

Table 6 shows the remaining stages in the computation of the estimate of log R. The two sets
of sums of squares and products are added and tests of heterogeneity and of parallelism are
made. The heterogeneity test is identical with a χ² test on the observed and expected numbers
of dead and survivors for each dose. Consequently, when one or more small expectations occur
the χ² may be unduly large; this point has been discussed more fully elsewhere (Finney, 1947),
but causes no alarm here since the calculated χ² is so small. The average value of the regression
coefficient is next found, and is used to give the two regression lines and estimates of log median
lethal doses. Values of Y calculated from these lines are the same as those already used in Table 5,
so that no further cycle of computations is needed. The estimate of log R is the difference between
the two log median lethal doses, and its variance is found from equation (24), putting $N_S$ = 34.0,
$N_T$ = 34.7; the fiducial limits are similarly found from (27). The conclusion to be drawn is that
intramuscular injection has shown no significant difference from lymph-sac injection in the form
of the response curve, except that the potency has been increased to 209 per cent. of the lymph
sac value; the 5 per cent. fiducial limits of this ratio are 267 per cent. and 169 per cent. Table 7
shows comparative results for four different response metameters; even the fourth of these,
implying a linear relationship of P and log dose, here gives results practically the same as the others
since the data are so very regular.
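
The arithmetic of Tables 5 and 6 can be retraced with a short script. This is a sketch under stated assumptions rather than a transcription of equations (23), (24) and (27), which are not reproduced in this section: the variance and fiducial-limit expressions below are the usual parallel-line forms with unit variance per unit weight, chosen because they reproduce the printed figures, and the function names are invented for the illustration. The weights nw and working logits y are taken as printed in Table 5.

```python
import math

def weighted_sums(x, y, nw):
    """Total weight, weighted means, and corrected sums Sxx, Sxy."""
    N = sum(nw)
    xbar = sum(w * xi for w, xi in zip(nw, x)) / N
    ybar = sum(w * yi for w, yi in zip(nw, y)) / N
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(nw, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar) for w, xi, yi in zip(nw, x, y))
    return N, xbar, ybar, sxx, sxy

def relative_potency(standard, test, t=1.96):
    """Common slope, log R, its standard error and fiducial limits."""
    NS, xS, yS, sxxS, sxyS = weighted_sums(*standard)
    NT, xT, yT, sxxT, sxyT = weighted_sums(*test)
    sxx = sxxS + sxxT
    b = (sxyS + sxyT) / sxx                  # pooled regression coefficient
    M = xS - xT - (yS - yT) / b              # estimate of log R
    d = M - xS + xT
    se = math.sqrt((1 / NS + 1 / NT + d * d / sxx) / b ** 2)
    g = t * t / (b * b * sxx)
    half = t / (b * (1 - g)) * math.sqrt((1 - g) * (1 / NS + 1 / NT) + d * d / sxx)
    centre = M + g * d / (1 - g)
    return b, M, se, (centre - half, centre + half)

# (x, y, nw) columns from Table 5
lymph_sac = ([0.75, 0.85, 0.95], [4.05, 4.65, 5.06], [6.3, 12.8, 14.9])
intramuscular = ([0.45, 0.55, 0.65], [4.05, 4.65, 5.35], [7.3, 13.7, 13.7])

b, log_R, se, (lower, upper) = relative_potency(lymph_sac, intramuscular)
R = 10 ** log_R
```

Run on the tabulated columns, the script gives a slope close to 5.743, log R close to 0.321 with standard error about 0.042, and limits near 0.228 and 0.426, in line with Table 6.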


Table 7

Comparison of Four Response Metameters for the Analysis of the Assay in Table 5

                     Log LD50
Response          Lymph sac   Intramuscular    Log R     R      5 per cent. fiducial
metameter                                                       limits for R

Probit              0.924        0.603         0.321    2.09       1.70-2.65
Logit               0.924        0.603         0.321    2.09       1.69-2.67
P = sin² Y          0.922        0.603         0.319    2.08       1.70-2.62
P = Y               0.919        0.605         0.314    2.06       1.72-2.53

In many forms of quantal assay, allowance has to be made for a “natural mortality” amongst
the test subjects, or the equivalent phenomenon for responses other than death. Thus in
insecticide trials a proportion of insects would probably die in the period between testing and classifying
of subjects even though they received no poison, and in assays of fungicides by spore germination
counts a proportion of untreated spores would fail to germinate. Providing that this natural
mortality occurs randomly and is not correlated with susceptibility to the poison, the proportion
of subjects whose death is due to the treatment, P, will be related to P′, the total proportion killed
at any dose, by the equation

$P' = C + P(1 - C),$ (54)

where C is the natural mortality rate. If not all subjects are susceptible to the materials under
test, a proportion K being immune whatever the dose given, then

$P' = C + P(1 - K - C).$ (55)

In reality, natural mortality may be greatest amongst the subjects of lowest tolerance, so that P
derived in this manner will not truly represent the effect of the poison alone acting on the whole
population, but equation (54) or the more general equation (55) may be taken as a first
approximation to the truth.





If C and K were known for the population from which the subjects were selected, values of p
could be obtained from the observed mortalities p′ by means of equation (55). The only alteration
then required in the assay procedure already described is that equation (47) must be replaced by

$w = \dfrac{\{f'(Y)\}^{2}}{\left(P + \dfrac{C}{1 - K - C}\right)\left(Q + \dfrac{K}{1 - K - C}\right)},$ (56)

which change, even for small values of C and K, will reduce w considerably when P or Q is near
zero. For the probit transformation, Finney (1944, 1947) has tabulated this function with K = 0,
the most important case. In practice C and K have usually to be estimated from experimental
data, a situation which greatly complicates the maximum likelihood equations. Finney has
shown a practicable computational procedure of successive approximations for K = 0, and this
can easily be extended for use when K ≠ 0; apart from the additional labour, there are no
essentially new features in the analysis.
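
When C and K are taken as known, the adjustment of an observed mortality and the reduced weighting coefficient can be sketched as follows. The weight expression used here is derived, as an assumption of this sketch, by applying the binomial variance to the total proportion P′ = C + P(1 − K − C); the function names and the logistic choice of f in the usage lines are illustrative and not from the paper.

```python
import math

def adjusted_proportion(p_total, C, K=0.0):
    """Invert P' = C + P(1 - K - C): proportion killed by the treatment alone."""
    return (p_total - C) / (1.0 - K - C)

def adjusted_weight(Y, C, K, f, f_prime):
    """Weighting coefficient allowing for natural mortality C and immunity K."""
    P = f(Y)
    Q = 1.0 - P
    d = 1.0 - K - C
    return f_prime(Y) ** 2 / ((P + C / d) * (Q + K / d))

def logistic(Y):
    return 0.5 * (1.0 + math.tanh(Y))

def logistic_prime(Y):
    return 0.5 / math.cosh(Y) ** 2

w_plain = adjusted_weight(-1.5, 0.0, 0.0, logistic, logistic_prime)
w_natural = adjusted_weight(-1.5, 0.05, 0.0, logistic, logistic_prime)
```

With C = K = 0 the expression reduces to equation (47); with a natural mortality of 5 per cent. the weight at a low expected kill is cut sharply, which is the behaviour described above.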


6. Bacterial Assays from Dilution Series 

A slightly different type of quantal assay that may formally be analysed by the general 
method of this paper is the estimation of a bacterial population from a dilution series. A series 
of dilutions (usually in geometrical progression) of a bacterial suspension is prepared and several 
plates are inoculated from each dilution; each plate is later classified as positive or negative 
according to whether or not growth occurs, but no colony count is made. From this dichotomous 
classification the density of bacteria in the original suspension is to be estimated. On the assump- 
tions that the original suspension was sufficiently well mixed to give a Poisson distribution of the 
numbers of bacteria in equal units of volume, and that presence of a single bacterium on a plate 
will cause growth, the probability of a negative plate at a dilution of x parts of suspension per 
unit volume is 

$P = e^{-\mu x},$ (57)


where μ is the density of bacteria in the original suspension reckoned as the number in a volume
equal to that used for the unit inoculum.

Fisher (1922) has shown that, when equal numbers of plates are tested at each of a series of
dilutions with values of x in geometrical progression (a convenient and popular practical technique),
an estimate of μ with 88 per cent. efficiency can be made by equating the total number of sterile
plates to its expectation, provided only that the range of dilutions tested is sufficiently extensive
to give a very low probability of a sterile plate at one end and a very low probability of a fertile
plate at the other. Fisher and Yates (1947, Table VIII 2) have given a table to assist rapid
estimation by this method. So high an efficiency is satisfactory for most purposes, and solution of
the maximum likelihood equations given by Fisher is seldom necessary unless either the series
of dilutions is short, or the values of x are not in geometrical progression.

Halvorson and Ziegler (1933) and Swaroop (1938) have given tables of solutions of the 
maximum likelihood equation for [i when equal numbers of plates have been inoculated at each 
of three dilutions in a ten-fold series. In general, the solution can be performed by the method 
of Section 5 : by writing 

$$Y = -\log_e P, \qquad (58)$$

equation (57) is transformed into

$$Y = \beta x. \qquad (59)$$

For this transformation

$$\frac{dP}{dY} = -P,$$

and therefore, from equations (46) and (47), if Y is determined from a first approximation to β,
a new approximation may be obtained as the weighted linear regression coefficient of the working
response

$$y = Y + 1 - p\,e^{Y} \qquad (60)$$

on x; p is the proportion of sterile plates observed at dilution x, and

$$w = \frac{1}{e^{Y} - 1} \qquad (61)$$

is the weighting coefficient per plate tested at this dilution. If only one plate is set up for each
dilution, p is necessarily always 0 or 1; the procedure can still be followed, though the first
approximation may not be very satisfactory. The fitted regression line,

$$Y = bx, \qquad (62)$$

must be constrained to pass through the origin.

The range of dilutions will usually be sufficient to give expected probabilities of sterile plates 
at one end very close to zero and at the other very nearly unity ; consequently Y will run from very 
large values to very small, w from very small to very large. Retention of the right numbers of 
digits in the computations is often made easier by writing 


$$Y' = Y/x, \qquad (63)$$

$$y' = y/x = Y' + \frac{1 - p\,e^{xY'}}{x}, \qquad (64)$$

$$w' = x^2 w = \frac{x^2}{e^{xY'} - 1}; \qquad (65)$$

b is then the weighted mean value of y', using weights nw'. This modification is particularly
easy when the dilutions differ by powers of 10.

Example: Halvorson and Ziegler (1933) discuss the example shown in Table 8, relating to a
test with 160 plates at each of six dilutions in a ten-fold series. For convenience, the scale of x is
chosen to give unit dilution somewhere in the middle of the range. The numbers of sterile plates
are shown in the column headed r, and the proportion sterile is shown as p. A first approximation
to the estimate, b, of the mean number of organisms per unit of inoculum at the dilution x = 1 is
the average of the empirical values of $Y' = -\frac{1}{x}\log_e p$. Here this function is readily seen to have
the greatest weight at x = 10 and at x = 1, much less at x = 10^-1, and a negligible weight
elsewhere; hence a first approximation, Y', is taken as a little less than the empirical value for
x = 1.

Table 8

Estimation of Populations of Bacteria by Dilution Method
(Data of Halvorson and Ziegler)

   x        n      r      p      -(1/x) log_e p    Y'      y'       nw'     nw'y'
   10^3     160    0      0           ∞            0.47    0.471      0       0
   10^2     160    0      0           ∞            0.47    0.480      0       0
   10       160    2      0.012     0.442          0.47    0.438    147      64.386
   1        160    99     0.619     0.480          0.47    0.480    267     128.160
   10^-1    160    152    0.950     0.513          0.47    0.513     33      16.929
   10^-2    160    160    1.000     0              0.47   -0.001      3      -0.003

   Totals                                                           450     209.472

   Snw'y'^2          =  98.402
   (Snw'y')^2/Snw'   =  97.508
   χ² (3 d.f.)       =   0.894

   b = 209.472/450 = 0.4655 ± 0.0471




With the aid of a table of exponentials, y' and nw' were calculated from equations (64) and
(65) for Y' = 0.47; three, or at most four, significant digits in y' and three in nw' are quite
sufficient since the values of p have been determined from fewer than 200 plates. The weighted
sum of squares of deviations of y' may be taken as a χ², here with 3 degrees of freedom since
only four levels have really been used in the analysis. This test of the agreement between the
observations and the estimated population density is the same as could be calculated from observed
and expected numbers of sterile and fertile plates. The estimated number of bacteria per unit
of inoculum at the dilution x = 1 is

$$b = Snw'y'/Snw',$$

and its variance is 1/Snw'; the numerical result from Table 8 is

$$b = 0.4655 \pm 0.0471,$$

and the close agreement with the first approximation shows there to be no necessity for a further
cycle of computations. Halvorson and Ziegler, using only the three most informative dilutions,
obtained an estimate 0.4669; their estimation equation is the same as that used here, but they
do not describe any successive approximation process for its solution. Estimation from the total
number of sterile plates, by means of Fisher and Yates' table, gives 0.472 ± 0.047, in close
agreement with the maximum likelihood figure. For so large a number of plates, the standard error
can probably be used safely for assigning fiducial limits to the population density, but had there
been only a few plates at each dilution an uncritical calculation of limits might be misleading.
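
The cycle of computation described above can be sketched in a few lines of Python. This is an illustration only, not the author's worksheet: the function name `dilution_cycle` and its arguments are hypothetical, and the working response and weight are taken in the forms y' = Y' + (1 - p e^{xY'})/x and w' = x²/(e^{xY'} - 1), which reproduce the entries of Table 8. Because the code carries more digits than the hand computation, it returns about 0.465 rather than exactly 0.4655.

```python
import math

def dilution_cycle(x, n, p, Y1):
    """One cycle of the weighted-mean calculation for a dilution series.

    x  : dilution levels (unit dilution = 1)
    n  : number of plates tested at each level
    p  : observed proportion of sterile plates at each level
    Y1 : provisional value Y' of the density at x = 1
    Returns (b, standard error of b, chi-square of the deviations).
    """
    rows = []
    for xi, ni, pi in zip(x, n, p):
        t = xi * Y1                                   # provisional Y = Y'x
        y1 = Y1 + (1.0 - pi * math.exp(t)) / xi       # working response y'
        nw = ni * xi ** 2 / math.expm1(t)             # weight nw'
        rows.append((nw, y1))
    s_w = sum(nw for nw, _ in rows)                   # Snw'
    b = sum(nw * y1 for nw, y1 in rows) / s_w         # weighted mean of y'
    chi2 = sum(nw * (y1 - b) ** 2 for nw, y1 in rows)
    return b, 1.0 / math.sqrt(s_w), chi2

# Data of Table 8, with p rounded to three decimals as in the table.
x = [1e3, 1e2, 10.0, 1.0, 0.1, 0.01]
n = [160] * 6
p = [0.0, 0.0, 0.012, 0.619, 0.950, 1.000]

b, se, chi2 = dilution_cycle(x, n, p, 0.47)
print(round(b, 3), round(se, 3), round(chi2, 2))
```

A further cycle is obtained by calling `dilution_cycle` again with `b` in place of 0.47; for these data the change is small, in line with the remark that a second cycle is unnecessary.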

In most practical applications of the dilution series technique, the small loss of efficiency 
involved in Fisher’s method of estimation is outbalanced by the great simplicity of the calculations. 
If the number of plates is not the same for every dilution, or if the series of dilutions is other than 
the two-, three-, and ten-fold series for which Fisher and Yates have given tables, the method 
outlined here provides a moderately speedy way of obtaining the maximum likelihood estimate. 
The functions needed in the solution can easily be derived from tables of exponentials and natural 
logarithms: the excellent tables from the New York Works Project Administration are very 
suitable. Swaroop (1938) has tabulated $(e^{0.1n} - 1)^{-1}$, $(e^{0.01n} - 1)^{-1}$, and $(e^{0.001n} - 1)^{-1}$ for
integral values of n from 1 to 1,000, and his table may be used for many of the values of w' required
in ten-fold series. Barkworth and Irwin (1938) have also shown how the maximum likelihood 
estimate and its variance can be obtained by successive approximation, but their computations 
are less conveniently arranged than those in Table 8.
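
Fisher's simpler estimate, obtained by equating the total number of sterile plates to its expectation, can also be found numerically instead of from the Fisher and Yates table. The sketch below assumes that the probability of a sterile plate at dilution x is e^{-βx} (the form implied by Y = -log_e P = βx); the function names are hypothetical, and bisection is used here merely as a convenient root-finder, not as the method of the cited tables.

```python
import math

def expected_sterile(beta, x, n):
    """Expected total number of sterile plates, assuming P(sterile) = exp(-beta * x)."""
    return sum(ni * math.exp(-beta * xi) for xi, ni in zip(x, n))

def density_from_total_sterile(x, n, total_sterile, lo=1e-6, hi=10.0):
    """Solve expected_sterile(beta) = total_sterile by bisection.

    The expectation decreases steadily as beta increases, so a simple
    bracketing search is sufficient; lo and hi are assumed to bracket the root.
    """
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_sterile(mid, x, n) > total_sterile:
            lo = mid          # too many sterile plates expected: density too low
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Counts of Table 8: 160 plates at each of six ten-fold dilutions.
x = [1e3, 1e2, 10.0, 1.0, 0.1, 0.01]
n = [160] * 6
r = [0, 0, 2, 99, 152, 160]

estimate = density_from_total_sterile(x, n, sum(r))
print(round(estimate, 3))
```

For these counts the root lies close to the value 0.472 quoted above for the Fisher and Yates table.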

7. The Design of Assays 

The chief purpose of every biological assay of the general type discussed in Sections 2 and 3 
is the estimation of a relative potency with sufficient precision for appropriate action to be based 
upon it. Usually the assay is also required to provide both an estimate of this precision and a 
test of the validity of the statistical principles involved. In a laboratory in which a particular 
form of assay is being used as a routine analytical method, a system of control charts may be 
preferable to the treatment of each assay as self-contained in these respects (Knudsen, 1945; 
Knudsen and Randall, 1945). In other circumstances, the normal experimental procedure of 
planning each assay as a unit yielding its own estimate of error and its own validity tests may be 
adopted. 

For some types of assay, the demands that for a given amount of experimentation and measurement
both the most reliable estimate of ρ and a valid estimate of its variance shall be obtained are
compatible. On the other hand, few assays can provide tests of validity unless their design is so
modified as to be of less than optimal efficiency for the estimation of relative potency.
Consequently, in a well-planned assay, the dose levels and numbers of test subjects at each dose must be
chosen in relation to pre-existing knowledge of the potency of the material being assayed, of the
validity of Conditions I and II, and of the appropriate dose and response transformations; if
previous experience makes the experimenter certain of the validity of his statistical technique, he
need not incorporate a test of validity in his design and can concentrate on reducing the variance
of R, but otherwise, though he must not lose sight of his main objective, he must sacrifice some
precision in R in order to make a sufficiently detailed test of validity. In planning for the precision
of R, previous information on the magnitude of ρ is often valuable.




The methods of analysis of assay data outlined in this paper depend upon the representation
of the relationship between response and dose by means of a straight line, which is estimated
from the measurements either directly or after their transformation to suitable metameters.
Departure from linearity does not of itself mean that an assay is invalid: it may be only an indication
that the wrong metameters have been used. The chief requirement for validity is that Condition II
shall be satisfied, or in other words that a single "relative potency" shall suffice for the evaluation
of the test preparation in terms of the standard. Nevertheless non-linearity is at least a warning
that the wrong analytical procedure has been adopted; unless there are very sound reasons for
believing that the right metameters for rectifying the dose-response relationship are known in
advance, and that no serious deviations will occur, an assay should provide data for a test of
linearity as well as of Condition II.

In the choice of dose levels, the chief problem confronting the experimenter is how to make the
best use of a specified total, N, of test subjects. Though a large number of levels may be
inconvenient, the decision between, say, two doses of each preparation and five can often be based
primarily on statistical considerations of which is likely to give the more valuable results. For
simplicity of computation, the doses are usually taken as equally spaced on the scale of that
dose-metameter which previous experience suggests as appropriate to the assay. Discussion of
the principles governing the selection of dose levels in the several types of assay described earlier
will make these points clearer.

One method of improving the precision of an assay, which will not be discussed in any detail
here, is by a covariance analysis of the results with one or more concomitant measurements either
made on the test subjects before the doses are given or known to be unaffected by the type of
stimulus used. Adjustments for the initial weight of the test subject or the response it has shown
in a pre-experimental period may substantially reduce the variance of R, and should always be
considered when suitable measurements are either already available or can be made with little
extra trouble. The calculations, however, present no novel features of special interest for the
statistician; the analysis of variance of responses is expanded into analyses of variance and
covariance, and the responses are adjusted to equality of the concomitant measurements by means
of a simple or multiple regression equation. The estimate of relative potency is formed from
these adjusted responses in the usual way, though the calculation of its variance is complicated
by the adjustments.

(a) Logarithmic Dose Metameter. Equation (24) shows that when the logarithm of the dose
is used as the metameter in an assay based upon quantitative responses, for any specified numbers
of subjects used for the two preparations the variance of M will be least if

$$M = \bar{x}_S - \bar{x}_T;$$

hence, so far as previous information on M permits, the doses should be chosen to give a mean
difference equal to M. Unless the preparation to be assayed is very unusual, this condition can
generally be satisfied sufficiently well to make the third term in equation (24) much smaller than
the first two. The best use of a total of N test subjects will then be to assign ½N to each preparation.
The linear relationship between x and the response will usually hold only over a limited range
of doses; both the variance of b and the variance of M would be minimized by using ¼N subjects
at the highest and lowest doses of each preparation that could be employed without fear of going
outside the limits of linearity, and the variance of the estimate would then be

$$V(M) = \frac{4s^2}{Nb^2}\left[1 + \frac{2(M - \bar{x}_S + \bar{x}_T)^2}{d_S^2 + d_T^2}\right], \qquad (66)$$

where $d_S$, $d_T$ are the intervals between the higher and lower values of x for the two preparations.
Had the ½N subjects for each of the two preparations been divided between $k_S$, $k_T$ equally spaced
dose levels respectively, evaluation of $S_{xx}$ shows that the variance would have been

$$V(M) = \frac{4s^2}{Nb^2}\left[1 + \frac{6(M - \bar{x}_S + \bar{x}_T)^2}{\dfrac{k_S + 1}{k_S - 1}\,d_S^2 + \dfrac{k_T + 1}{k_T - 1}\,d_T^2}\right], \qquad (67)$$

which reduces to equation (66) when $k_S = k_T = 2$, but otherwise takes a higher value.
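
As a check on the two variance formulae, the following Python sketch evaluates V(M) for a design with k_S and k_T equally spaced levels. It rests on an assumption: the bracket of equation (67) is taken in the form that follows from S_xx for equally spaced doses with equal numbers of subjects, namely 1 + 6δ²/[(k_S + 1)d_S²/(k_S - 1) + (k_T + 1)d_T²/(k_T - 1)], where δ = M - x̄_S + x̄_T. The function name and the numerical inputs are hypothetical.

```python
def variance_of_m(s2, b, N, delta, d_s, d_t, k_s=2, k_t=2):
    """Variance of M for N subjects split equally between two preparations.

    s2     : error variance per response
    b      : regression coefficient on log dose
    delta  : M - xbar_S + xbar_T
    d_s/d_t: range of log dose for standard / test preparation
    k_s/k_t: number of equally spaced dose levels (2 gives the four-point assay)
    """
    spread = (k_s + 1) / (k_s - 1) * d_s ** 2 + (k_t + 1) / (k_t - 1) * d_t ** 2
    return 4.0 * s2 / (N * b ** 2) * (1.0 + 6.0 * delta ** 2 / spread)

# Illustrative (made-up) inputs: unit variance and slope, 40 subjects.
four_point = variance_of_m(1.0, 1.0, 40, 0.1, 1.0, 1.0)
six_point = variance_of_m(1.0, 1.0, 40, 0.1, 1.0, 1.0, k_s=3, k_t=3)
print(four_point, six_point)
```

With two levels of each preparation the expression collapses to the four-point form, 4s²/(Nb²) times 1 + 2δ²/(d_S² + d_T²), and any larger number of levels gives a greater variance.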





Purely from the point of view of obtaining the most precise estimate of relative potency, the 
popular “ four-point ” assay, involving equal numbers of test subjects at the highest and lowest 
doses of each preparation consistent with the hypothesis of a linear regression of y upon x, is the 
ideal. The validity of the analysis rests upon the assumption that the regression is linear and that 
the lines for the two preparations are parallel; Gridgeman (1943) and Wood (1944a) have indeed 
shown that if the true regression equations are quadratic the same estimate of M would be 
obtained from a four-point assay (for the special case $d_S = d_T$), but the estimate might be seriously
misleading if the regression function were very different in form. The four-point assay provides
data for a test of the significance of departures from parallelism, or, in other words, the significance
of the discrepancy from Condition II, but gives no test of the adequacy of the metameters used
for rectifying the dose-response relationship. One or more intermediate levels of dose must 
be included if a test of linearity is required ; three doses of each preparation will suffice to test 
the possibility that the relationship is simply concave or convex, but more will be needed if sigmoidal 
tendencies are feared. 

In this discussion it has been assumed that g, as defined by equation (26), is small for all the
designs considered. The spreading of a specified number of subjects over an increased number
of doses will tend to lower the precision of b and hence to increase g. No examination of the
effect of changes in $k_S$ and $k_T$ on the expression for fiducial limits, (27), has yet been made, but it is
clear that an increase in g will widen the true limits to a greater extent than the approximate.
Providing that g remains sufficiently small for this effect to be ignored, the loss of efficiency in the
estimation of M through using k levels of dose instead of two can be assessed from comparison
of equations (66) and (67). The simplest and most important case is that in which the extreme
range of values of x is the same for both preparations ($d_S = d_T$), for even if better knowledge
of the standard permitted a wider range to be used, practical convenience would often suggest
the contraction of this to $d_T$. The upper limit to the percentage efficiency of an arrangement
with k levels of each preparation, equally spaced logarithmically and with equal numbers of
subjects at each, may then be written


$$E_k(M) = \frac{100\left[1 + \dfrac{(M - \bar{x}_S + \bar{x}_T)^2}{d^2}\right]}{1 + \dfrac{3(k - 1)}{k + 1}\cdot\dfrac{(M - \bar{x}_S + \bar{x}_T)^2}{d^2}}. \qquad (68)$$

Unless previous information on M is particularly poor, the mean values of x can usually be so
chosen as to make $(M - \bar{x}_S + \bar{x}_T)$ a small proportion of d, and the loss of efficiency with moderate
values of k is then not serious; Table 9 shows some values of the efficiency.



Table 9

Upper Limit to Percentage Efficiency of Symmetrical 2k-Point Assay Relative to Four-point,
Calculated from Equation (68)

(g assumed to remain small)

                Values of $(M - \bar{x}_S + \bar{x}_T)/d$
    k       0.0     0.1     0.2     0.3     0.5     1.0
    2       100     100     100     100     100     100
    3       100     100      98      96      91      80
    4       100      99      97      94      86      71
    5       100      99      96      92      83      67
   10       100      99      95      89      77      58
    ∞       100      98      93      86      71      50

In practice, assays in which $(M - \bar{x}_S + \bar{x}_T)/d$ exceeded 0.3 would not be considered very
satisfactory, and values of this quantity greater than 0.5 are scarcely ever encountered.
Consequently the inclusion of even as many as five dose levels instead of two will cause at most a 15 per
cent. reduction in the precision of M, yet it will enable a thorough examination of departures
from linearity to be made. If the only cause of curvature to be feared is a decrease in slope as 
the dose is increased, the most sensitive test will be achieved by the use of three dose levels for each 
preparation, taking the additional level half-way between the two extremes (on the logarithmic 
scale); one advantage of using a greater number, however, is that if the curvature is apparent 
only at the highest doses the results for these may be rejected and the remainder treated as showing 
a linear relationship. If the possibility of a sigmoidal curve has to be taken into account, four or 
five dose levels will be needed. The best number of levels to take for any particular assay must 
be decided from existing knowledge of the materials being assayed and of the type of deviation 
from linearity that may occur. Only if there is a practical certainty of linearity should the four- 
point assay be used, since the loss in precision resulting from the precaution of using more levels 
should be slight unless the change also causes a marked increase in g. 
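
The efficiencies discussed above are simple to recompute. The Python sketch below evaluates the upper limit to the percentage efficiency of a symmetrical assay with k levels of each preparation relative to the four-point assay, writing θ for (M - x̄_S + x̄_T)/d; the formula used, 100(1 + θ²)/[1 + 3(k - 1)θ²/(k + 1)], is the form obtained by dividing the two-level variance by the k-level variance when d_S = d_T. The function name is hypothetical.

```python
import math

def efficiency_2k_point(k, theta):
    """Upper limit to percentage efficiency of a symmetrical 2k-point assay.

    k     : number of equally spaced dose levels per preparation (math.inf allowed)
    theta : (M - xbar_S + xbar_T) / d
    """
    ratio = 3.0 if math.isinf(k) else 3.0 * (k - 1) / (k + 1)
    return 100.0 * (1.0 + theta ** 2) / (1.0 + ratio * theta ** 2)

# Tabulate the same levels of k and theta as Table 9.
for k in (2, 3, 4, 5, 10, math.inf):
    row = [round(efficiency_2k_point(k, t)) for t in (0.0, 0.1, 0.2, 0.3, 0.5, 1.0)]
    print(k, row)
```

The printed rows can be compared directly with Table 9; for example, five levels with θ = 0.5 give an efficiency of about 83 per cent.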

The variance of M is also reduced by any procedure which increases the regression coefficient, b.
This, however, is a characteristic of the test subject, the stimulus and the experimental 
conditions, though it may be altered advantageously by using a carefully selected and more 
sensitive strain of subject or by some slight modifications in other conditions. When the assay 
is of the type previously described as analytical, more radical changes in the technique may be 
permissible if they promise to give a greater slope, since the same quantity is still being estimated ; 
a comparison of the potency of two poisons, on the other hand, may give entirely different results 
for two different species of test subject, or under different experimental conditions, since, unless 
the one poison can be described as a dilution of the other in an inert diluent, there is no reason 
to suppose that the true value ρ is unaltered by the change.

(b) Power Dose Metameter. When the dose metameter is a power of the dose, and providing
that the linearity extends down to zero dose, the mean response to zero dose should lie both on
the standard and on the test regression line. In microbiological assays, at present the most
important of this class, linearity down to x = 0 can usually be ensured (Wood, 1946d); the further
simplification that the dose as measured is itself a satisfactory metameter (in equation (9), λ = 1)
is also general, but does not seriously affect the problem of efficient design.

The circumstance that one point can be found common to both lines leads to some economy 
in dose levels. Wood and Finney (1946) have shown that a common-zero three-point design, 
including zero dose and the highest levels of each preparation consistent with linearity, is very 
much more efficient than any four-point design using two non-zero levels of each preparation. 
The minimum variance of the estimated relative potency obtainable from a total of N tubes would 
occur with an assay having 


$$\frac{n_0}{1 - R} = \frac{n_S}{R} = n_T, \qquad (69)$$

where the two preparations are measured on arbitrary scales which make the doses of each one
unit, and $n_0$, $n_S$, $n_T$ are the numbers of tubes in the blank, standard and test dose-groups respectively;
here and in the following paragraphs R is assumed to be less than or equal to unity, as would
always be true in a satisfactory assay of this kind. Thus, if R were nearly unity, the best design
would have practically no tubes at x = 0 and about equal numbers at x = 1 for each preparation.
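
The allocation just described can be illustrated numerically. The sketch below reads equation (69) as the proportions n_0 : n_S : n_T = (1 - R) : R : 1, and assumes (this is not stated in the text) independent responses of equal variance, so that the variance of the estimated R is proportional to 1/n_T + R²/n_S + (1 - R)²/n_0. Both function names are hypothetical.

```python
def optimal_three_point(N, R):
    """Split N tubes between blank, standard and test groups as (1 - R) : R : 1."""
    if not 0.0 < R <= 1.0:
        raise ValueError("R is assumed to lie in (0, 1]")
    return N * (1.0 - R) / 2.0, N * R / 2.0, N / 2.0

def variance_factor(R, n0, ns, nt):
    """Quantity proportional to the variance of R for a common-zero three-point design."""
    terms = [(1.0, nt), (R ** 2, ns), ((1.0 - R) ** 2, n0)]
    # A group whose coefficient is zero contributes nothing, even if it is empty.
    return sum(c / n for c, n in terms if c > 0.0)

N, R = 120, 0.8
n0, ns, nt = optimal_three_point(N, R)
print(n0, ns, nt, variance_factor(R, n0, ns, nt))

# An equal three-way split of the same 120 tubes, for comparison.
print(variance_factor(R, 40, 40, 40))
```

Under these assumptions the optimal split attains the value 4/N, and the equal split of the same tubes gives a larger variance factor.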

But. just as for assays with a logarithmic dose metameter, additional dose levels are needed if 
the assay is to provide tests of validity and linearity. Wood and Finney have discussed this 
matter at length, and their conclusions need only be summarized here. The simplest design
giving information on departures from linearity is what Wood (1946a) has called the common-zero 
five-point design, consisting of single and double doses of each preparation as well as zero-dose ; 
the double dose of the standard preparation is chosen as the highest that can be used without risk 
of going beyond the limit of linearity, and the double dose of the test preparation is chosen as an 
amount expected to give a mean response close to, but not exceeding, that for the double dose of 
the standard. In these units R should be almost unity, and in a satisfactory assay the estimate 
should certainly lie between 1.0 and 0.7. Wood and Finney have given the complex expression
for the percentage efficiency of this design relative to that specified by equation (69),
namely:

$$E(R) = \frac{100\left[n_0(n_S' + 4n_S)(n_T' + 4n_T) + n_S' n_S (n_T' + 4n_T) + n_T' n_T (n_S' + 4n_S)\right]}{N\left[R^2\{N(n_T' + 4n_T) - (n_T' + 2n_T)^2\} - 2R(n_S' + 2n_S)(n_T' + 2n_T) + \{N(n_S' + 4n_S) - (n_S' + 2n_S)^2\}\right]}, \qquad (70)$$

where $n_0$ is the number of tubes at zero dose, $n_S'$, $n_T'$ the numbers at the single doses of the two
preparations, and $n_S$, $n_T$ the numbers at the double doses. Again g is assumed to remain small
throughout, a condition that will usually be satisfied without difficulty in microbiological assays.
The simplest and most convenient arrangement is to take N/5 tubes in each group, whereupon
equation (70) reduces to

$$E(R) = \frac{350}{8R^2 - 9R + 8}, \qquad (71)$$

a function which increases from 50 at R = 1.0 to 62 at R = 0.7. This efficiency seems poor by
comparison with the values for k = 3 in Table 9, but, as may be seen from Table I of Wood and
Finney's paper, no re-distribution of the tubes increases it appreciably without seriously reducing
the sensitivity of the linearity test. Seldom, if ever, is there any advantage in distributing the tubes
unequally between the standard and the test preparations, but occasionally the increase in the
precision of R caused by assigning a greater proportion of the tubes to the highest doses at the
expense of zero dose (say 10 per cent. of all tubes at x = 0, 15-20 per cent. at x = 1, 30-25 per
cent. at x = 2 for each preparation) more than compensates the experimenter for the less
satisfactory linearity test. For most routine purposes the fully symmetrical five-point assay is likely
to be suitable, but once again a decision must be based on previous knowledge of the potency
and of the validity of the technique.

The five-point assay is adequate for the detection of simple concavity or convexity of the 
response curve, but more points will be necessary in order to detect sigmoidal departures from 
linearity. The simplest and most important class of design is the fully symmetrical common-zero 
(2k + 1)-point assay, in which N/(2k + 1) tubes are assigned to zero dose, to the highest dose
of each preparation, and to (k - 1) equally spaced intermediate doses of each. Wood and
Finney have given its percentage efficiency, when the results show satisfactory linearity up to
the highest dose, as

$$\frac{400(k + 1)(k^2 + k + 1)}{3k\left[(R^2 + 1)(5k^2 + 5k + 2) - 6Rk(k + 1)\right]}, \qquad (72)$$

this being assessed relative to the arrangement specified by equation (69). Values of this function,
which has a maximum at a point between R = ½ and R = ⅗ (depending upon the value of k),
are shown in Table 10. The reduction in efficiency with increasing k is more marked than in
Table 9, but even so the loss through taking k = 4 instead of k = 2 may sometimes be unimportant
by comparison with the gain in information on other points connected with the assay. The
symmetrical nine-point design, indeed, is a useful arrangement for disclosing any sigmoidal trends.


Table 10

Upper Limit to Percentage Efficiency of Symmetrical Common-Zero (2k + 1)-Point Design
Relative to Asymmetrical Three-point, Calculated from Equation (72)

(g assumed to remain small)

                 Values of R
    k       1.0     0.9     0.8     0.7
    1        67      73      79      84
    2        50      55      59      62
    3        44      49      52      55
    4        42      46      49      51
    5        40      44      47      49
   10        37      40      43      45
    ∞        33      37      39      41
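
The efficiencies of the common-zero designs can be recomputed with a short Python sketch. Two assumptions are made, and neither should be read as the exact printed form of Wood and Finney's expressions: the five-point efficiency is derived from the least-squares variance of R = b_T/b_S for doses 0, 1 and 2, and the symmetrical (2k + 1)-point efficiency is taken as 400(k + 1)(k² + k + 1)/{3k[(R² + 1)(5k² + 5k + 2) - 6Rk(k + 1)]}. The function and argument names are hypothetical.

```python
def efficiency_five_point(R, n0, ns1, ns2, nt1, nt2):
    """Percentage efficiency of a common-zero five-point design.

    n0       : tubes at zero dose
    ns1, nt1 : tubes at the single doses of standard and test
    ns2, nt2 : tubes at the double doses of standard and test
    """
    N = n0 + ns1 + ns2 + nt1 + nt2
    a_s, a_t = ns1 + 2 * ns2, nt1 + 2 * nt2      # sums of doses
    b_s, b_t = ns1 + 4 * ns2, nt1 + 4 * nt2      # sums of squared doses
    num = n0 * b_s * b_t + ns1 * ns2 * b_t + nt1 * nt2 * b_s
    den = N * (R ** 2 * (N * b_t - a_t ** 2)
               - 2 * R * a_s * a_t
               + (N * b_s - a_s ** 2))
    return 100.0 * num / den

def efficiency_symmetric(k, R):
    """Percentage efficiency of the fully symmetrical common-zero (2k + 1)-point design."""
    num = 400.0 * (k + 1) * (k * k + k + 1)
    den = 3.0 * k * ((R * R + 1) * (5 * k * k + 5 * k + 2) - 6 * R * k * (k + 1))
    return num / den

for k in (1, 2, 3, 4, 5, 10):
    print(k, [round(efficiency_symmetric(k, R)) for R in (1.0, 0.9, 0.8, 0.7)])
```

With equal numbers in all five groups the first function agrees with 350/(8R² - 9R + 8), and the second reproduces the finite-k rows of Table 10.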

In contrast with the logarithmic dose metameter, the precision of these assays is not necessarily
increased by the choice of a strain of test organism which gives a greater standard regression
coefficient; an improvement is only effected by an increase in the maximum expected response
still on the linear portion of the response curve, and an increase in $b_S$ the result of which is merely
that the same maximum linear response is reached for a lower dose does not affect the precision
attainable (Wood, 1946d).

(c) Quantal Responses. The dependence of the weighting coefficient upon the expected
response makes the efficient planning of assays based upon quantal responses more difficult than
those discussed in (a) and (b). Again the variance of M will be least when

$$M = \bar{x}_S - \bar{x}_T,$$

and any previous information on M should be utilized so as to attempt to satisfy this condition;
usually this will be done by choosing corresponding doses of the two preparations which differ
by a constant amount, believed to be about the value of M, on the logarithmic scale. If this
condition could be fulfilled exactly, the variance of M, equation (24), would be minimized (for the
probit transformation) by having all dose levels of each preparation close to that which previous
experience suggests as likely to give 50 per cent. kill, since at this point w is greatest. But if
information as complete as this existed, the assay would not be needed! Conversely, the less the
previous knowledge about the potencies of the two preparations, the more likely is M, determined
from the assay, to differ appreciably from the difference in mean log dose; even a small difference
might contribute largely to the variance of M if a very restricted range of doses made $S_{xx}$ very
small.

A middle course must therefore be found between the too restricted dose range of the last 
paragraph and the other extreme of using doses which give either zero or 100 per cent, kills, since 
these will have very small values of w. If expected kills can be kept within the limits of 5 to 95 per
cent., or say Y = 3.5 to Y = 6.5, the weighting coefficient never falls below one-third of its value
at 50 per cent.; very little widening of the limits reduces the weights much more seriously, and at
1 or 99 per cent. w is little more than one-tenth of its maximum. Often a suitable practical
procedure will be to use three, four, or five dose levels of the standard preparation equally spaced
logarithmically over the range expected to give 10 to 90 (or perhaps 5 to 95) per cent. kill, and
corresponding levels of the test preparation at these amounts increased by the value of M which
previous information suggests. These doses may have to be selected by guess as much as by 
knowledge, but they will be the best that can be chosen on the evidence available ; if so little is 
known of the potencies in advance, only by good fortune will a precise assay be obtained at the 
first attempt. 
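
The statements about the fall in weight away from 50 per cent. kill can be verified with a few lines of Python. The sketch assumes the usual probit weighting coefficient w = Z²/(PQ), where Z is the ordinate of the standard normal curve at the deviate corresponding to P and Q = 1 - P; the function name is hypothetical, and `statistics.NormalDist` is used for the normal curve.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution, mean 0 and s.d. 1

def probit_weight(P):
    """Probit weighting coefficient w = Z^2 / (P * Q) at expected kill P."""
    z = _NORMAL.inv_cdf(P)            # normal deviate (probit minus 5)
    return _NORMAL.pdf(z) ** 2 / (P * (1.0 - P))

w_max = probit_weight(0.5)
for P in (0.50, 0.10, 0.05, 0.01):
    print(P, round(probit_weight(P) / w_max, 3))
```

The ratio printed for 5 per cent. kill is a little above one-third, and that for 1 per cent. kill a little above one-tenth, in agreement with the figures quoted above.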

Of course, the observed kills at any dose may deviate widely from their expectations, especially
when the groups of subjects are small. Indeed, as an extreme instance, only one subject might
be tested at each dose, though of course many more than five levels would then be needed, and
every observation would be of zero or 100 per cent. kill. For example, in some techniques for the
assay of insecticides, the feeding of a specified amount of poison is impracticable, and instead the
amount ingested by an individual must be measured as the difference between the amount supplied
and the amount remaining, a quantity which will generally not be the same for any two individuals.
The statistical technique is not altered materially, though the estimation of provisional regression
lines is more difficult and the computations are generally lengthy; the tests, however, are likely
to be unsatisfactory unless some grouping is adopted (Garwood, 1940; Finney, 1947).

8. Summary 

Every biological assay depends upon a response curve, which is a regression relationship 
between the dose of a stimulus applied to a test subject and the magnitude of the expected response; 
for one material to be assayable in terms of the other, the two regression equations must be
identical except for a multiplicative factor in the dose scale, which factor is the potency of the one 
preparation relative to the other. The central statistical problem of biological assay is the
estimation of this relative potency from experimental results and the assessment of its precision. In
most types of assay the regression equations must first be estimated, and usually the data must 
also be tested for their agreement with the basic assumption that the true equations are identical. 

In this paper an attempt has been made to show the unity of all assay problems. The statistical 
analysis of a large and important class can be simplified by means of transformations of dose and 
response which make the regression relationship between the transformed quantities linear. The 
maximum likelihood equations for the estimation of relative potency in these assays have been 
developed in general form, and a computational procedure has been described by means of which 
their solution becomes a process of successive approximation to the linear regression. The method 
is a generalization of the familiar technique of probit analysis, and includes all the simple types of 
macro- or microbiological assay, based either upon quantitative or upon quantal responses, in 
regular use to-day. It also includes various transformations for quantal responses which have 
been proposed as alternatives to the probit. The maximum likelihood estimation procedure 
follows the same pattern for all of these, so that when once certain preliminary tables have been 
prepared, none has any advantage in respect of ease of computation. The only serious competitor 
of the probit is perhaps the logit, based upon the logistic curve; a little consideration shows
that the apparent advantages of this are largely due to a misconception of the basis of the
transformation. Nevertheless, the logit possesses some interest, and tables to assist its use, analogous
to the familiar tables for the probit, are presented together with a numerical example of the 
computations. The regression procedure is shown also to be applicable to the formation of the 
maximum likelihood estimate of a bacterial population from a dilution series. 

The last section of the paper is concerned with the design of assays, especially in respect of 
the choice of the number and location of dose levels and of the number of test subjects at each
dose. Usually an assay is limited to a specified total number of subjects, and the experimenter 
has to distribute these in the best possible manner between several doses of the standard and test 
preparations. The design from which the most precise estimate would be obtained generally 
does not permit a sensitive test of the validity of the assay ; a compromise must then be sought 
which provides an adequate validity test without requiring too great a sacrifice in the main objective, 
the precise estimation of potency. Tables are given to show the reduction in precision consequent 
upon an increased number of dose levels. Three or more points must be determined for each 
regression line if any test of linearity is to be made, and the choice of doses can be much assisted 
by any previous information on the potency. There is seldom any reason for departing from a 
symmetrical design using equal numbers of test subjects at corresponding doses of the standard 
and test preparations. 


Acknowledgments 

This paper owes much to discussion and correspondence on the nature of biological assay 
with many who are directly concerned in assay practice. Particularly am I indebted to Dr. 
E. C. Wood, whose stimulating suggestions and whose refusal to be satisfied with any incomplete 
answer have assisted the clarification of many problems. 

Table 3 is taken, with consent of the authors, from Table XII of Standard Four-Figure Mathe- 
matical Tables by L. M. Milne-Thomson and L. J. Comrie; a trivial change of the argument 
and the addition of 5 to the tabular value are the only alterations. 


References 

Bacharach, A. L. (1945), “Biological assay and chemical analysis,” Analyst, 70, 394-403.
Bacharach, A. L., Coates, M. E., and Middleton, T. R. (1942), “A biological test for vitamin P activity,” Biochem. J., 36, 407-12.
Barkworth, H., and Irwin, J. O. (1938), “Distribution of coliform organisms in milk and the accuracy of the presumptive coliform test,” J. Hyg., Camb., 38, 446-57.
Berkson, J. (1944), “Applications of the logistic function to bio-assay,” J. Amer. statist. Ass., 39, 357-65.
Bliss, C. I. (1934a), “The method of probits,” Science, 79, 38-9.
Bliss, C. I. (1934b), “The method of probits - a correction,” ibid., 79, 409-10.
Bliss, C. I. (1935a), “The calculation of the dosage-mortality curve,” Ann. appl. Biol., 22, 134-67.
Bliss, C. I. (1935b), “The comparison of dosage-mortality data,” ibid., 22, 307-33.
Bliss, C. I. (1940), “Quantitative aspects of biological assay,” J. Amer. pharm. Ass., 29, 465-75.
Bliss, C. I. (1946), “An experimental design for slope-ratio assays,” Ann. math. Statist., 17, 232-7.



Bliss, C. I., and Cattell, McK. (1943), “Biological assay,” Ann. Rev. Physiol., 5, 479-539.
Bliss, C. I., and Packard, C. (1941), “Stability of the standard dosage-effect curve for radiation,” Amer. J. Roentgenol., 46, 400-4.

Burn, J. H. (1930), “The errors of biological assay,” Physiol. Rev., 10, 146-69.
de Beer, E. J. (1945), “The calculation of biological assay results by graphic methods. The all-or-none type of response,” J. Pharmacol., 85, 1-13.
Emmens, C. W. (1940), “The dose response relation for certain principles of the pituitary gland, and of the serum and urine of pregnancy,” J. Endocrinol., 2, 194-225.
Fechner, G. T. (1860), Elemente der Psychophysik. Leipzig: Breitkopf und Härtel.
Fieller, E. C. (1940), “The biological standardization of insulin,” J. Roy. statist. Soc. Suppl., 7, 1-64.
Fieller, E. C. (1944), “A fundamental formula in the statistics of biological assay, and some applications,” Quart. J. Pharm., 17, 117-23.
Fieller, E. C. (1947), “Some remarks on the statistical background in bio-assay,” Analyst, 72 (in the press).
Finney, D. J. (1944), “The application of the probit method to toxicity test data adjusted for mortality in the controls,” Ann. appl. Biol., 31, 68-74.
Finney, D. J. (1945), “The microbiological assay of vitamins: The estimate and its precision,” Quart. J. Pharm., 18, 77-82.
Finney, D. J. (1947), Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. London: Cambridge University Press.

Fisher, R. A. (1922), “On the mathematical foundations of theoretical statistics,” Phil. Trans., A, 222, 309-68.
Fisher, R. A. (1925), “Theory of statistical estimation,” Proc. Camb. phil. Soc., 22, 700-25.
Fisher, R. A. (1935), “Appendix to Bliss, C. I.: The case of zero survivors,” Ann. appl. Biol., 22, 164-5.
Fisher, R. A., and Yates, F. (1947), Statistical Tables for Biological, Agricultural and Medical Research (3rd ed.). Edinburgh: Oliver & Boyd.
Gaddum, J. H. (1933), “Reports on biological standards. III. Methods of biological assay depending on a quantal response,” Spec. Rep. Ser. med. Res. Coun., Lond., no. 183.
Garwood, F. (1940), “The application of maximum likelihood to dosage-mortality curves,” Biometrika, 32, 46-58.
Gridgeman, N. T. (1943), “The technique of the biological vitamin A assay,” Biochem. J., 37, 127-32.
Halvorson, H. O., and Ziegler, N. R. (1933), “Applications of statistics to problems in bacteriology. I. A means of determining bacterial population by the dilution method,” J. Bact., 25, 101-21.
Ipsen, J. (1941), Contribution to the Theory of Biological Standardization. Nyt Nordisk Forlag, Arnold Busck, Copenhagen.
Ipsen, J., and Jerne, N. K. (1944), “Graphical evaluation of the distribution of small experimental series,” Acta path. microbiol. Scand., 21, 343-61.

Irwin, J. O. (1937), “Statistical method applied to biological assays,” J. Roy. statist. Soc. Suppl., 4, 1-60.
Kent-Jones, D. W., and Meiklejohn, M. (1944), “Some experiences of microbiological assays of riboflavin, nicotinic acid and other nutrient factors,” Analyst, 69, 330-6.
Knudsen, L. F. (1945), “The use of statistics in biological experimentation and assay,” J. Assoc. off. agric. Chem., Wash., 28, 806-13.
Knudsen, L. F., and Randall, W. A. (1945), “Penicillin assay and its control chart analysis,” J. Bact., 50, 187-200.
Martin, J. T. (1940), “The problem of the evaluation of rotenone-containing plants. V. The relative toxicities of different species of derris,” Ann. appl. Biol., 27, 274-94.
Martin, J. T. (1942), “The problem of the evaluation of rotenone-containing plants. VI. The toxicity of l-elliptone and of poisons applied jointly, with further observations on the rotenone equivalent method of assessing the toxicity of derris root,” ibid., 29, 69-81.
Miller, L. C., Bliss, C. I., and Braun, H. A. (1939), “The assay of digitalis. I. Criteria for evaluating various methods using frogs,” J. Amer. pharm. Ass., 28, 644-57.
Swaroop, S. (1938), “Numerical estimation of B. coli by dilution method,” Ind. J. med. Res., 26, 353-78.
Tattersfield, F., and Potter, C. (1943), “Biological methods of determining the insecticidal values of pyrethrum preparations (particularly extracts in heavy oil),” Ann. appl. Biol., 30, 259-79.

Trevan, J. W. (1927), “The error of determination of toxicity,” Proc. Roy. Soc., B, 101, 483-514.
Urban, F. M. (1909), “Die psychophysischen Massmethoden als Grundlagen empirischer Messungen,” Arch. ges. Psychol., 15, 261-355.
Urban, F. M. (1910), “Die psychophysischen Massmethoden als Grundlagen empirischer Messungen (continued),” ibid., 16, 168-227.
Wilson, E. B., and Worcester, J. (1943a), “The determination of LD50 and its sampling error in bio-assay,” Proc. nat. Acad. Sci., Wash., 29, 79-85.
Wilson, E. B., and Worcester, J. (1943b), “The determination of LD50 and its sampling error in bio-assay: II,” ibid., 29, 114-20.
Wilson, E. B., and Worcester, J. (1943c), “Bio-assay on a general curve,” ibid., 29, 150-4.
Wood, E. C. (1944a), “Mathematics of biological assay,” Nature, Lond., 153, 84-5.
Wood, E. C. (1944b), “Mathematics of biological assay,” ibid., 153, 680-1.
Wood, E. C. (1946a), “The theory of certain analytical procedures, with particular reference to micro-biological assays,” Analyst, 71, 1-14.
Wood, E. C. (1946b), “Computation of biological assays,” Nature, Lond., 158, 835.
Wood, E. C., and Finney, D. J. (1946), “The design and statistical analysis of micro-biological assays,” Quart. J. Pharm., 19, 112-27.





APPENDIX (added after the Meeting) 

A Further Note on Assay Design, Efficiency, and Reliability 

Since this paper was completed, it has been pointed out to me that the discussion of assay
design in Section 7 underestimates the ill-effects of dividing a specified total number of test subjects
between many doses. The results given in that section are true without qualification provided
that b (or $b_s$) is very large compared with its standard error. On the other hand, if this condition
is not satisfied, any increase in k, the number of dose levels, causing an increase in the variance
of the regression coefficient, may so increase the quantity g that the fiducial range assigned to the
relative potency is widened to a greater extent than is indicated by the “efficiencies” listed in Tables
9 and 10. In this Appendix the effect on the fiducial range of changes in g, induced by changes
in k, will be briefly considered.

For quantitative responses and a logarithmic dose metameter, the efficiency of a symmetrical
2k-point assay relative to a four-point has been expressed by equation (68). This expression
involves the use of equation (24) for the variance of M. Since equation (24) can be used in assessing
the fiducial limits for M only when g is small, a more reasonable measure of the reliability
with which M can be estimated is perhaps given by consideration of the magnitude of the fiducial
range. If ½N subjects are assigned to each preparation, expression (27) shows that

$$\text{Square of fiducial range} \propto \left[\frac{4(1-g)}{N} + \frac{(M - \bar{x}_s + \bar{x}_t)^2}{S_{xx}}\right] \Big/ (1-g)^2.$$

For the symmetrical 2k-point assay

$$S_{xx} = \frac{N d^2 (k+1)}{12(k-1)},$$

whence, from equation (26),

$$g = \frac{k-1}{k+1} \times \frac{12 s^2 t^2}{N d^2 b^2}. \qquad (73)$$


If the value of t is chosen to correspond with a selected probability, the second factor in this
expression for g is unaffected by changes in k (except for sampling variation). An increase in k
might also alter the subdivision of degrees of freedom in the analysis of variance and allow fewer
degrees of freedom for the estimation of $s^2$, so increasing t; when N is moderately large, however,
a slight change in the degrees of freedom has little effect on t. From equation (73), as k increases
from 2 to a large number, the value of g is almost trebled, and this may result in a widening of the
fiducial range by a greater amount than equation (68) suggests. In fact, by substitution in the
earlier formula,

$$\text{Square of fiducial range} \propto \frac{1 - g + \dfrac{3(k-1)}{k+1}\left(\dfrac{M - \bar{x}_s + \bar{x}_t}{d}\right)^2}{(1-g)^2}, \qquad (74)$$

where

$$g = \frac{(k-1)A}{k+1}$$

and

$$A = \frac{12 s^2 t^2}{N d^2 b^2}. \qquad (75)$$
If A is zero, the ratio of the squares of the fiducial ranges for k = 2 and for any other value of
k is the same as the “efficiency” of the 2k-point assay stated in equation (68) and tabulated in
Table 9. The introduction of A into equation (74) shows how the fiducial range is affected by
the increased variance of b resulting from increase in k; the ratio of the value of this expression
for k = 2 to its value for the k in question may be termed the reliability of the assay. Table 11
shows the percentage reliability of the 2k-point assay, relative to the corresponding four-point,
for a series of values of A and two extreme values of $(M - \bar{x}_s + \bar{x}_t)/d$. Since A is a function of t,
the reliability as here defined depends upon the level of probability chosen for determining the
fiducial limits; the efficiency (Table 9) may be regarded as the limit to which the reliability tends
as the probability is increased to unity and t therefore decreased to zero.
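The reliability just defined can be tabulated numerically. The following is a minimal sketch in Python, assuming that the squared fiducial range is proportional to $[1 - g + 3\delta^2(k-1)/(k+1)]/(1-g)^2$ with $g = A(k-1)/(k+1)$ and $\delta = (M - \bar{x}_s + \bar{x}_t)/d$; the function names and the symbol `delta` are illustrative and do not appear in the paper.

```python
def fiducial_range_sq(k, A, delta):
    """Quantity proportional to the squared fiducial range of a 2k-point assay.

    k     : number of dose levels per preparation (k >= 2)
    A     : 12 s^2 t^2 / (N d^2 b^2), the part of g that does not depend on k
    delta : (M - xbar_s + xbar_t) / d  (illustrative name, an assumption)
    """
    ratio = (k - 1) / (k + 1)
    g = A * ratio
    if g >= 1:
        return float("inf")  # the fiducial range is unbounded once g reaches 1
    return (1 - g + 3 * ratio * delta ** 2) / (1 - g) ** 2


def reliability(k, A, delta):
    """Percentage reliability relative to the four-point (k = 2) design."""
    return 100 * fiducial_range_sq(2, A, delta) / fiducial_range_sq(k, A, delta)


if __name__ == "__main__":
    for k in (2, 3, 4, 5, 10):
        print(k, [round(reliability(k, A, 0.0)) for A in (0, 0.2, 0.4, 0.6, 0.8, 1.0)])
```

Under these assumptions the printed rows can be compared directly with the body of Table 11 for the case in which $(M - \bar{x}_s + \bar{x}_t)/d$ is zero.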


Table 11

Percentage Reliability of Symmetrical 2k-Point Assay Relative to Four-point, Derived from
Equation (74)

(i) $(M - \bar{x}_s + \bar{x}_t)/d = 0$

                Values of A [equation (75)]
   k       0     0.2    0.4    0.6    0.8    1.0
   2     100    100    100    100    100    100
   3     100     96     92     88     82     75
   4     100     94     88     80     71     60
   5     100     93     85     75     64     50
  10     100     90     78     64     47     27
   ∞     100     86     69     50     27      0

(ii) $(M - \bar{x}_s + \bar{x}_t)/d = 0.3$

                Values of A [equation (75)]
   k       0     0.2    0.4    0.6    0.8    1.0
   2     100    100    100    100    100    100
   3      96     92     87     82     75     67
   4      94     87     80     71     61     48
   5      92     84     75     64     52     37
  10      89     78     65     49     32     14
   ∞      86     70     53     33     13      0



Inspection of Table 11 indicates that in general there will be no very serious loss through
using three levels of dose instead of two, unless A, a characteristic of the assay technique and test
subject (and of the selected level of probability), is large. Provided that A is less than 0.4, use
of as many as five dose levels instead of two will at worst reduce the reliability to 75 per cent.
of its maximum, unless an unfortunate choice of doses leads to an exceptionally large value of
$(M - \bar{x}_s + \bar{x}_t)/d$. This, admittedly, compares unfavourably with the statement in Table 9 that
for k = 5 and $(M - \bar{x}_s + \bar{x}_t)/d = 0.3$ the loss of efficiency is only 8 per cent.; nevertheless, the
increase in the fiducial range corresponding to the values in Table 11 (a maximum of 15 per cent.
of its value for k = 2) may be a small price to pay for data leading to adequate tests of assay
validity. The contention that at least three dose levels of each preparation should be used, unless
there is practical certainty of linearity, may still be maintained, for even a 33 per cent. loss in
reliability (for example, with $(M - \bar{x}_s + \bar{x}_t)/d = 0.3$, A = 1.0) is likely to be preferable to the
absence of any linearity test.

A similar argument may be developed for the situation in which the dose metameter is a power
of the measured dose, and, in particular, when the dose itself may be used as the metameter.
From (32) above,

$$\text{Square of fiducial range} \propto \left[v_t - 2Rc + R^2 v_s - g\left(v_t - \frac{c^2}{v_s}\right)\right] \Big/ (1-g)^2.$$

In a symmetrical (2k + 1)-point assay, in which the range of doses for each preparation is from
0 to d, it may be shown (Wood and Finney, 1946) that

$$v_s = v_t = \frac{3k(5k^2 + 5k + 2)}{N d^2 (k+1)(k^2 + k + 1)}$$

and

$$c = \frac{9k^2}{N d^2 (k^2 + k + 1)}, \qquad (76)$$

whence, using equation (33),

$$g = \frac{k(5k^2 + 5k + 2)}{(k+1)(k^2 + k + 1)} \times \frac{3 s^2 t^2}{N d^2 b_s^2}. \qquad (77)$$
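The variance and covariance factors of the two slopes can be checked by direct least squares. The sketch below is illustrative only: it assumes equal groups of N/(2k + 1) subjects at zero dose and at doses d/k, 2d/k, ..., d of each preparation (an allocation not spelled out in this Appendix), and the function names are hypothetical. Exact rational arithmetic is used so that agreement with the closed forms $3k(5k^2+5k+2)/[Nd^2(k+1)(k^2+k+1)]$ and $9k^2/[Nd^2(k^2+k+1)]$ can be tested without rounding.

```python
from fractions import Fraction


def slope_variance_factors(k, N=1, d=1):
    """Return (v, c): variance and covariance factors of the two slopes,
    obtained by inverting the corrected sums of squares and products."""
    n = Fraction(N, 2 * k + 1)                      # subjects per dose group (assumed equal)
    doses = [Fraction(j * d, k) for j in range(1, k + 1)]
    sum_x = n * sum(doses)                          # sum of x_s (identical for x_t)
    sum_xx = n * sum(x * x for x in doses)          # sum of x_s squared
    s_ss = sum_xx - sum_x ** 2 / N                  # corrected sum of squares
    s_st = -sum_x ** 2 / N                          # corrected sum of products
    det = s_ss ** 2 - s_st ** 2
    return s_ss / det, -s_st / det


def closed_form(k, N=1, d=1):
    """Closed-form variance and covariance factors for the same design."""
    v = Fraction(3 * k * (5 * k * k + 5 * k + 2),
                 N * d * d * (k + 1) * (k * k + k + 1))
    c = Fraction(9 * k * k, N * d * d * (k * k + k + 1))
    return v, c


if __name__ == "__main__":
    for k in range(1, 6):
        print(k, slope_variance_factors(k) == closed_form(k))
```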


As in equation (73), the second factor in equation (77) is independent of k, except for sampling
variation and possibly small changes in t due to alterations in degrees of freedom. This expression
for g increases by a factor of 2.5 as k increases from 1 to infinity. Substitution for $v_s$, $v_t$, c and g
now gives

$$\text{Square of fiducial range} \propto \frac{(1 + R^2)\alpha - R\beta - A\gamma}{(1 - A\alpha)^2}, \qquad (78)$$

where

$$A = \frac{3 s^2 t^2}{N d^2 b_s^2}, \qquad (79)$$

$$\alpha = \frac{k(5k^2 + 5k + 2)}{(k+1)(k^2 + k + 1)},$$

$$\beta = \frac{6k^2}{k^2 + k + 1},$$

$$\gamma = \frac{4k^2 (2k+1)^2}{(k+1)^2 (k^2 + k + 1)}.$$


In Section 7, the efficiency of the symmetrical (2k + 1)-point assay was expressed relative to
an asymmetrical three-point assay with the subdivision of N test subjects specified by equation
(69). This subdivision was determined so as to minimize the variance of R expressed in equation
(31); when R is near to unity, it assigns very few subjects to zero dose, and thus makes the precision
of estimation of the regression coefficients very much lower than for a symmetrical three-point
design. For many types of microbiological assay, $s^2$, the variance of responses about their
regression lines, is very small; with a moderate level of probability, even a reduction to a single
tube at zero dose may then not make g sufficiently large to affect the use of equation (31) in assessing
fiducial limits. (Nevertheless, to assign no tubes to tests at zero dose, as would be required
by equation (69) with R = 1, would clearly make the whole assay useless.) For these assays, the
efficiency, given by equation (72) and Table 10, is also a satisfactory measure of the reliability.
For other classes of assay in which the response is directly proportional to the dose, and when R
is near to unity, the effect of using the three-point assay specified by equation (69) may be so to
increase the variance of $b_s$ that g becomes large; equation (31) is then not adequate as a means of
assessing the fiducial range. In such assays it is scarcely of practical importance to discuss the
subdivision of subjects given by equation (69), as this would never be used; more usefully, perhaps,
the efficiency of the (2k + 1)-point design may then be expressed relative to the symmetrical
three-point, which involves replacing equation (72) by

$$\text{Percentage efficiency} = \frac{200 (R^2 - R + 1)(k+1)(k^2 + k + 1)}{k\left[(R^2 + 1)(5k^2 + 5k + 2) - 6Rk(k+1)\right]}. \qquad (80)$$

Table 10 would be replaced by Table 12, and, over the range studied, the efficiency so defined
is seen to be practically independent of R.
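As a numerical illustration, the efficiency of equation (80) may be evaluated directly. This is a small sketch in Python, taking equation (80) in the form $200(R^2 - R + 1)(k+1)(k^2+k+1) / \{k[(R^2+1)(5k^2+5k+2) - 6Rk(k+1)]\}$; the function name is illustrative.

```python
def efficiency(k, R):
    """Percentage efficiency of the symmetrical common-zero (2k+1)-point design
    relative to the symmetrical three-point design (g assumed small)."""
    numerator = 200 * (R * R - R + 1) * (k + 1) * (k * k + k + 1)
    denominator = k * ((R * R + 1) * (5 * k * k + 5 * k + 2) - 6 * R * k * (k + 1))
    return numerator / denominator


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 5, 10):
        print(k, [round(efficiency(k, R)) for R in (1.0, 0.9, 0.8, 0.7)])
```

For k = 1 the expression reduces to 100 for every R, which is a convenient check that the three-point design has been taken as the baseline.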





Table 12

Upper Limit to Percentage Efficiency of Symmetrical Common-Zero (2k + 1)-Point Design
Relative to Symmetrical Three-point, Calculated from Equation (80)

(g assumed to remain small)

                Values of R
   k      1.0    0.9    0.8    0.7
   1     100    100    100    100
   2      75     75     74     74
   3      67     67     66     65
   4      62     62     62     61
   5      60     60     59     58
  10      55     55     54     54
   ∞      50     50     49     49



When g cannot be neglected, the reliability may be defined as before by means of ratios of
squares of fiducial ranges obtained from equation (78). This reliability becomes the same as
the efficiency stated in equation (80) when A is put equal to zero; the introduction of non-zero
values for A makes allowance for the increased variance of $b_s$ resulting from an increase in k.
The percentage reliability of the (2k + 1)-point assay relative to the three-point is given in Table
13 for a series of values of A and two extreme values of R. As for the logarithmic dose metameter,
A is a function of t and the reliability depends upon the level of probability chosen for
the fiducial limits; again, the efficiencies of Table 12 are limits to which the reliabilities tend as
the probability is increased to unity and t decreased to zero.
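The reliability for the common-zero design can be sketched in the same way. The code below assumes that the squared fiducial range is proportional to $[(1+R^2)\alpha - R\beta - A\gamma]/(1 - A\alpha)^2$, where $\alpha = k(5k^2+5k+2)/[(k+1)(k^2+k+1)]$, $\beta = 6k^2/(k^2+k+1)$ and $\gamma = 4k^2(2k+1)^2/[(k+1)^2(k^2+k+1)]$; the names `alpha`, `beta`, `gamma` and both function names are labels chosen here for illustration.

```python
def range_sq(k, A, R):
    """Quantity proportional to the squared fiducial range, (2k+1)-point design.

    A is 3 s^2 t^2 / (N d^2 b_s^2); alpha, beta and gamma are illustrative
    names for the three functions of k that enter the expression.
    """
    q = k * k + k + 1
    alpha = k * (5 * k * k + 5 * k + 2) / ((k + 1) * q)
    beta = 6 * k * k / q
    gamma = 4 * k * k * (2 * k + 1) ** 2 / ((k + 1) ** 2 * q)
    g = A * alpha
    if g >= 1:
        return float("inf")  # fiducial range unbounded
    return ((1 + R * R) * alpha - R * beta - A * gamma) / (1 - g) ** 2


def reliability(k, A, R):
    """Percentage reliability relative to the symmetrical three-point (k = 1)."""
    return 100 * range_sq(1, A, R) / range_sq(k, A, R)


if __name__ == "__main__":
    for k in (1, 2, 3, 4, 5, 10):
        print(k, [round(reliability(k, A, 1.0)) for A in (0, 0.04, 0.08, 0.12, 0.16, 0.20)])
```

With A set to zero the function reproduces the efficiency of equation (80), as the text requires.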

Table 13

Percentage Reliability of Symmetrical Common-Zero (2k + 1)-Point Assay Relative to
Symmetrical Three-point, Derived from Equation (78)

(i) R = 1.0

                Values of A [equation (79)]
   k       0    0.04   0.08   0.12   0.16   0.20
   1     100    100    100    100    100    100
   2      75     71     66     60     52     42
   3      67     61     55     47     37     24
   4      62     56     49     40     29     16
   5      60     54     46     36     25     11
  10      55     48     39     29     16      4
   ∞      50     42     33     22      9      0

(ii) R = 0.7

                Values of A [equation (79)]
   k       0    0.04   0.08   0.12   0.16   0.20
   1     100    100    100    100    100    100
   2      74     70     66     62     56     48
   3      65     61     56     49     41     30
   4      61     56     50     43     34     22
   5      58     53     47     39     30     17
  10      54     48     41     32     21      8
   ∞      49     42     35     25     14      0






Between R = 1.0 and R = 0.7 the reliability is practically independent of R. Table 13
confirms the indications of Tables 10 and 12 that the increase in the fiducial range consequent
upon an increase in k is more serious for these assays than for assays with a logarithmic dose
metameter. Unless there is almost a certainty of linearity over the whole range of doses, however,
some test of linearity must be included and k = 2 is the least value to be considered; the further
loss when k is increased to 4 is not great, provided that A is small (though it may be important
for large A), and is often a useful insurance against the possibility that the highest doses are beyond
the range of linearity. In any satisfactory microbiological assay, $d b_s$ is likely to be at least of the
order of 5 or 10 times s (in the nicotinic acid assay discussed in Section 4, for example, about
55 times); inspection of equation (79) shows that A will then normally be below 0.04, and probably
below 0.01. For this most important class of slope-ratio assays, the reliability of the symmetrical
five-point design is about 75 per cent. of that for the symmetrical three-point, essentially the same
result as for the efficiency, and there is a further loss of about 15 per cent. if a symmetrical nine-point
design is used; these figures represent increases of about 15 per cent. and about 30 per cent.
in the fiducial ranges by comparison with the symmetrical three-point.
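Because the reliability is a ratio of squared fiducial ranges, the corresponding widening of the range itself follows from a square root. A short Python sketch of that conversion (the function name is illustrative):

```python
import math


def range_increase_percent(reliability_percent):
    """Percentage widening of the fiducial range implied by a given
    percentage reliability (a ratio of squared ranges)."""
    return 100 * (1 / math.sqrt(reliability_percent / 100) - 1)


if __name__ == "__main__":
    for rel in (75, 60):
        print(rel, round(range_increase_percent(rel), 1))
```

Reliabilities of 75 and 60 per cent. thus correspond to ranges wider by roughly 15 and 29 per cent. respectively.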

No detailed discussion of the fiducial range for a potency estimate from an assay based on 
quantal responses will be presented here. Considerations arise similar to those for quantitative 
responses with a logarithmic dose metameter; an increase in k may again increase g and con- 
sequently widen the fiducial limits to a greater extent than mere consideration of variances derived 
from equation (24) would suggest, but the situation is complicated by the differential weighting 
at different levels of mortality. Once again three levels of dose for each preparation, rather than 
two, are practically essential, unless there is almost a certainty of linearity; whether or not more 
are used in any assay must depend upon the particular circumstances, upon existing knowledge 
in respect of linearity, slope, and variability, and finally upon the discretion of the experimenter. 

DISCUSSION ON Mr. FINNEY’S PAPER 

Mr. E. C. Fieller : Mr. Finney has given us a stimulating synthesis, with some elegant and 
useful additions of his own, and it is a pleasure to propose this vote of thanks to him. He has 
written, he says, for statisticians rather than for biologists or executives; I find the boundaries of 
these three classes increasingly difficult to trace, but in any case I am sure that Mr. Finney will 
deservedly have a wider public than he claims. 

Although the paper stresses the formal analogy between assays in which the response is quantal, 
and those in which it is continuous, I find it more convenient to discuss them separately. One of 
the most surprising features of the statistical theory of quantal-response assays is that although
it can be formulated quite concisely, as Mr. Finney has shown, it nevertheless took some sixteen 
years to reach its present refined state. Trevan and Gaddum, I think, broke the back of the 
problem ; in 1927 Trevan showed that in a quantal-response assay the problem is that of comparing 
two frequency distributions of individual effective doses, and in 1933 Gaddum reduced it to the 
weighted-regression form by using as ordinate the equivalent deviation, in place of the percentage 
reacting. In 1935 Bliss, writing from the Galton Laboratory, inverted the Working-Hotelling 
argument to obtain fiducial limits for the dose corresponding to an arbitrarily-fixed percentage- 
reacting; at the same time Fisher applied the method of maximum likelihood to deal with the 
extreme cases in which, in one or more of the dosage-groups, the observed percentage-reacting is 
0 or 100. Irwin’s discussion in 1937 led, as Bliss has said, to a reconsideration of the problem, 
and in 1938 the full maximum likelihood method for dealing with a single dosage-mortality line 
was announced. Irwin and Cheeseman in 1939, and Garwood in 1940, gave a mathematical 
discussion of this case; the former authors dealt also with the assay problem. We had to wait 
until 1943 for a specific discussion, by Irwin, of the fiducial limits for a quantal-response assay, 
although it might be claimed, I suppose, that the result happened to be contained in a more
general one that I had given in 1940. 

In 1934 or 1935 something else happened that seems to me less useful. Gaddum’s normal 
equivalent deviations were increased by 5, and rechristened probits. Mr. Finney has dutifully 
done the same for the logistic equivalent deviation, but I doubt whether it is worth while. He 
avoids negative numbers in his illustrative Table 5 (but not in his basic Table 4) at the expense 
of writing more digits ; I think that I would myself have introduced negative numbers into the 
example anyway, by using -1, 0 and +1 as working abscissae. For routine calculations I have
found it helpful to form a table giving w and wy, rather than w and y, and here the addition of 5 to
the equivalent deviation would be a real inconvenience, since it would double the size of such a table. 
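The relation between the two scales under discussion is easily illustrated. The sketch below uses the Python standard library's `statistics.NormalDist` to compute a normal equivalent deviation from a proportion responding, and the probit as that deviation increased by 5; the function names are illustrative.

```python
from statistics import NormalDist


def normal_equivalent_deviate(p):
    """Normal equivalent deviation for a proportion p responding (0 < p < 1)."""
    return NormalDist().inv_cdf(p)


def probit(p):
    """Probit: the normal equivalent deviation increased by 5."""
    return normal_equivalent_deviate(p) + 5


if __name__ == "__main__":
    for p in (0.001, 0.05, 0.5, 0.95):
        print(p, round(normal_equivalent_deviate(p), 3), round(probit(p), 3))
```

The deviation changes sign at 50 per cent. response, whereas the probit stays positive for any proportion likely to arise in practice, which is the convenience, and the cost in extra digits, referred to above.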



I should like to mention here one difficulty that I feel about quantal-response theory, even
though it may be of little practical importance. It is customary in such assays to calculate the
fiducial limits from a formula analogous to Mr. Finney’s equation (27), as he had done in his
Section 5. Now when we are dealing with a continuous-response assay this formula can be
rigorously established; we need only consider, in the light of the experimental data, what range
of values we can accept for a quantity estimated as the ratio

$$\frac{\bar{y}_s - \bar{y}_t}{b},$$

where the numerator and denominator are both linear functions, with constant coefficients, of the
observed responses. In the quantal case, however, there is the added complication that our
ratio involves not only the adjusted equivalent deviations but also the estimated weights, the
sampling errors of which are ignored in the usual derivation of the formula for the fiducial limits.
Mr. Finney’s formula (27) can of course be written in a variety of ways; I have found it convenient
to put (in his notation)

[equation illegible in the source]

so that the limits are given in terms of M′ by two further expressions:

[two equations illegible in the source]

An attractive alternative form for this last equation,

[equation illegible in the source]

was suggested to me by a remark of H. P. Marks, who will be well remembered as an inspiring
and very practical-minded worker in the field of biological assay.

In discussing quantitative-response assays, Mr. Finney has recalled that in the symmetrical
four-point assay the assumption of a parabolic log (dose)-response line, instead of a straight one,
leaves the formulae unaltered. It seems best to regard this equivalence as a fluke. If the doses
of test are in the same ratio as the doses of standard, then whether the dosage-groups are equal
or not, the parabolic assumption leads us to estimate the log (potency ratio) as

$$\frac{(\bar{y}_{t1} + \bar{y}_{t2}) - (\bar{y}_{s1} + \bar{y}_{s2})}{(\bar{y}_{t1} - \bar{y}_{t2}) + (\bar{y}_{s1} - \bar{y}_{s2})},$$

multiplied by the appropriate scaling factor. Here $\bar{y}_{s1}$ and $\bar{y}_{s2}$ are the mean responses to the higher
and lower doses of the standard, and $\bar{y}_{t1}$ and $\bar{y}_{t2}$ have similar meanings for the test preparation.
It is easily seen that if the variance is independent of the dose, there will in general be a correlation
between the numerator and denominator of this fraction, which will modify the calculation of
fiducial limits, and that the formulae for interpreting the assay will coincide with those derived
from the assumption of linearity only when the four dosage-groups are all equal. Some years ago
I used this modified procedure on one set of data, but only one; I finished up with wider fiducial
limits than I would have had, if I had rejected the dose that appeared to fall in the flat portion of
the log (dose)-response curve. This is of course no basis for argument, but I still class the procedure
as one that is nice to know, but dangerous to use; it seems desirable at any rate, if one
does use it, to verify that the turning point of the resulting parabola lies outside the relevant
dose-range, so that Mr. Finney’s Condition I is not violated.
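The four-point estimate described above may be sketched as follows. The scaling factor is not specified in the text; the code assumes it to be the logarithm (base 10) of the ratio of the higher to the lower dose, and assumes that the test and standard doses are expressed in nominally equal units, so that the result is the log potency relative to the assumed potency. The function name, argument names and the response figures in the example are all hypothetical.

```python
import math


def four_point_log_potency(ys_hi, ys_lo, yt_hi, yt_lo, dose_ratio):
    """Log10 potency ratio (test relative to standard) from the four mean
    responses of a symmetrical four-point assay.

    dose_ratio is the ratio of the higher to the lower dose, taken to be the
    same for both preparations; its log is the assumed scaling factor.
    """
    numerator = (yt_hi + yt_lo) - (ys_hi + ys_lo)
    denominator = (yt_hi - yt_lo) + (ys_hi - ys_lo)
    return math.log10(dose_ratio) * numerator / denominator


if __name__ == "__main__":
    # Hypothetical mean responses, for illustration only.
    m = four_point_log_potency(ys_hi=60, ys_lo=40, yt_hi=65, yt_lo=45, dose_ratio=2)
    print(round(m, 4), round(10 ** m, 3))
```

The numerator and denominator share the same four means, which is the source of the correlation mentioned in the text when the dosage-groups are unequal.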

As Mr. Finney has mentioned in a footnote to his third section, I prefer myself, when dealing
with an assay method using quantitative responses, to begin by finding a response metameter
that equalizes the variance, rather than one that straightens the log (dose)-response line. Of
course, if there is a metameter that does both jobs (and this is what is assumed of the metameter,
in applying the standard procedure), the route by which we arrive at it does not matter. The
difference between our approaches is due, I think, to the fact that Mr. Finney has primarily considered
the needs of the research worker who wants, at the end of his experiment, to make full
use of all the information he has obtained; whereas I have had in the past to consider the needs
of the laboratory engaged in routine assay, where the objective must be to ensure beforehand that
every animal used will supply a reasonable amount of information. I believe that in most macrobiological
assay-methods we shall find, if we measure responses on a scale that stabilizes their
variance, that the log (dose)-response line is logistic in form, with the steep central portion effectively
linear. Outside this portion changes in dose will produce relatively little change in response,
so that the regular occurrence of curved log (dose)-response lines is an indication that some of
the animals are being used inefficiently.

I want, in conclusion, to make one general point. I think that it is the business of the mathe- 
matical statistician to provide objective methods of drawing conclusions from experimental 
data, and for that reason I hope that none of Mr. Finney’s readers will imagine that he advocates 
graphical methods as more than a preliminary to doing the sums. There is not much point in 
doing an assay unless some decision has to be based on its outcome, and the decisions that have 
to be made are usually of the type: “What is a fair price for this oil?” “What is the therapeutic
value of this preparation?” “How much ought we to dilute this concentrate?” “Does this preparation
conform to specification?” I believe that rational methods of answering such questions can
be given in terms of estimates of activity and fiducial limits, and not in any other terms, and accord- 
ingly that the calculation of these quantities should be regarded as an integral part of the assay 
procedure. 

I have much pleasure in proposing a vote of thanks to Mr. Finney for his informative and
stimulating paper. 

Dr. Irwin: Before seconding this vote of thanks, may I say what a pleasure it is to see here
two of the real pioneers of the subject, Sir Percival Hartley and Dr. Trevan.

It was the stimulus of Professor Gaddum’s now famous report on biological assay using
quantal responses which first led me to get interested in the subject, and so to the survey which I
gave this Society 10 years ago. I little imagined at the time that interest would spread in the way
in which it has. In their review in 1943 Bliss and Cattell listed 275 papers, and there must be many
more by now.

To Mr. Finney we are enormously indebted for re-stating lucidly the old points and intro- 
ducing us to the new. His distinctive contributions are the general treatment of the problem 
of transforming the response curve into a straight line and the work on microbiological assays 
where the response is linearly related to the metameter. He has dealt with both problems in an
extremely able way; I particularly admire the ingenuity of using a double regression function to 
express simultaneously the arithmetic relations in test and standard. 

I quite agree with Mr. Finney that approximate methods of evaluating the result of an assay
are often quite good enough for routine purposes, also that drawing a graph of the data is always
an instructive procedure. I do not, however, think that a graphical method should be used
for routine estimation; some simple method such as Behrens’ or Kärber’s in the quantal
case is more satisfactory because it can be agreed upon by several workers who would get the
same result for the same data.

In the purely formal treatment of the general problem of rectification on p. 51 it was unneces- 
sary for him to assume that the variances of the untransformed response are the same at all 
dosage levels. If they are not, the effect is simply to introduce $1/\sigma^2$ under the summation signs
in equation (17) with a consequent modification of the weights in equation (19), otherwise 
everything is unchanged. This brings the treatment of quantitative and quantal responses 
completely into line with one another. There remains the difference that in the quantitative 
case we can actually examine the variances of the transformed variate. If they are found 
not to differ significantly at the different dosage levels, unequal weights and successive approxi- 
mations are unnecessary, as Mr. Finney has indicated. 

One always hopes that this will prove to be so; the successive approximation procedure, which 
one is bound to employ in the quantal case, is rather tedious ; I have found many cases in which 
four approximations are necessary to get 2-decimal accuracy in [symbol illegible in the source], and the worse the data the
more the number of approximations usually required. The example in Table 5 is, compared with 
my general experience, more than usually favourable. 

I agree with Mr. Finney that the logistic response curve has no advantage over the normal 
for quantal responses, but it may occasionally be useful for continuous responses, and Dr. Emmens 
has employed it for that purpose. 

I have sometimes wondered whether the introduction of the probit was any improvement
over the original normal-equivalent deviation. We avoid negative quantities at the expense of
using larger numbers, and double the number of entries necessary in a table of maximum or
minimum working probits. Also the habit of adding 5 seems to necessitate a new jargon. Statisticians
brought up in the Victorian tradition liked using long words with Greek derivations.
For example, the late Professor Karl Pearson introduced the term “heteroscedasticity” for unequal
scatter. Now the pendulum has swung the other way. We have probits and logits and rankits.
What are we to call the equivalent deviation of Section 6? The term logit being already appropriated,
should it be an expit? There is a danger that the uninitiated may come to think that they
are all rackits. I would put in a plea for the simple abbreviation E.D. in all cases.




Before I leave quantal responses may I pay a tribute to Professor Gaddum’s intuition. In
his 1933 paper he introduced the correct test (at any rate the best test we have up till now) of goodness
of fit by taking $\chi^2 = \sum (r - nP)^2/(nPQ)$, which fits in with the maximum likelihood solution.

Actually there remains the point that the distribution of this x*. is not precisely the same as 
that tabulated, since we are dealing with binomial, not normal distributions, and if we have, say, 
only 3 doses with one very large and another very small expected mortality, the disturbance may 
be appreciable. We discussed this point ten years ago, and I don’t think it is quite settl^ yet. 
Bliss thought that for very small and very large doses the contribution to x® would tend to be too 
small and 1 thought the exp>ectation would be correct. On p. 66 Mr. Finney says x* niay be 
unduly large. I think that we were all three right. I have worked out the distribution of the 
contribution to x* for a mortality of 0*05 and 10 animals. In 91 per (^nt. cases the contribution 
is *526 ; in the remaining 9 per cent, it exceeds 4*7. Its expectation is unity. Thus the contribution 
will most often be too small, but it will also be too large more often than the tables of x* allow 
for; were the distribution of the square root (x) of the contribution normal, the former would 
exceed 4*7 in only 3 per cent, of cases. It is the intermediate values that are lacking. Just after 
working this out, I received the results of a number of assays of an antigen with which we have 
lately been concerned. There were 6 experiments, comprising 26 tests, and 3 doses with 1 5 animals 
each to a test. There were no significant differences in siope, and the analysis of variance of the 


26 slope estimates is as follows : 

5. o/S. 

D.F. 

Mean Sq. 

Between experiments 

2*6387 

5 

0-5278 

Within f Between tests 

191364 

20 

0-9568 

Experiments \ Within tests . 

31-5252 

29 

1-0871 


The mean square between experiments is somewhat below its expected value but not significantly 
so. However, the striking feature of the table is the closeness of the mean squares to unity, the 
theoretical value which they should have if the χ² distribution holds.
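The figures quoted above for a mortality of 0.05 and 10 animals can be checked by direct enumeration of the binomial distribution. The Python sketch below assumes that the contribution of a single dose group to χ² is (r − nP)²/(nPQ), where r is the number of deaths among n animals and P the expected mortality; this form is an assumption, chosen because it reproduces the values 0.526 and 4.7 given in the text. The function name is illustrative only.

```python
from math import comb, erfc, sqrt

def chi2_contribution_distribution(n, p):
    """Exact distribution of (r - n*p)**2 / (n*p*q) for r ~ Binomial(n, p).

    Returns a dict mapping each possible contribution (rounded to 6 places,
    so that equal values from different r are merged) to its probability.
    """
    q = 1.0 - p
    dist = {}
    for r in range(n + 1):
        prob = comb(n, r) * p**r * q ** (n - r)
        value = round((r - n * p) ** 2 / (n * p * q), 6)
        dist[value] = dist.get(value, 0.0) + prob
    return dist

dist = chi2_contribution_distribution(10, 0.05)

smallest = min(dist)                                  # value taken when r = 0 or r = 1
p_smallest = dist[smallest]                           # probability of that value
next_value = min(v for v in dist if v > smallest)     # value taken when r = 2
expectation = sum(v * pr for v, pr in dist.items())   # mean contribution

# Two-sided normal tail: chance that a unit normal deviate squared exceeds 4.7.
normal_tail = erfc(sqrt(4.7) / sqrt(2.0))

print(f"contribution {smallest:.3f} with probability {p_smallest:.3f}")
print(f"next possible contribution {next_value:.3f}")
print(f"expectation {expectation:.4f}, normal tail beyond 4.7: {normal_tail:.3f}")
```

With r = 0 or r = 1 the contribution is the same, which is why a single small value carries about 91 per cent. of the probability, while the next attainable value (r = 2) already lies beyond 4.7.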

May I now make a remark about the four point assay with equal dose intervals and equal
weights mentioned on p. 57. It is easy to show that the contribution to the sum of squares in the
analysis of variance of the responses made by the quadratic term is precisely the same as that due
to the difference between slopes when the quadratic term is ignored. In other words if linearity is
assumed, the assay provides a test of parallelism; if parallelism is assumed, the assay provides a
test of departure from linearity. The position is much clarified if the two parabolas are fitted in
orthogonal form. If we give 1 and 2 units of standard and 1 and 2 assumed units of test, or
constant multiples of these doses, and take our origin mid-way between the upper and lower doses,
the two response curves are easily found to be:

Standard: $Y = \bar{y} + bx - \tfrac{1}{2}Mb + c\,[(x - \tfrac{1}{2}M)^2 - \tfrac{1}{4}(1 + M^2)]$
Test:     $Y = \bar{y} + bx + \tfrac{1}{2}Mb + c\,[(x + \tfrac{1}{2}M)^2 - \tfrac{1}{4}(1 + M^2)]$

with $b = \tfrac{1}{2}(-y_1 + y_2 - y_3 + y_4)$, $Mb = \tfrac{1}{2}(-y_1 - y_2 + y_3 + y_4)$,
$\bar{y} = \tfrac{1}{4}(y_1 + y_2 + y_3 + y_4)$,

where, for the coefficients of ȳ, b, (Mb) and c, we have

         ȳ      b     (Mb)     c
y1       1     −½     −½     +½M
y2       1     +½     −½     −½M
y3       1     −½     +½     −½M
y4       1     +½     +½     +½M

whence, by the usual rule, we may write
$c = \tfrac{1}{2}M\,(y_1 - y_2 - y_3 + y_4)/M^2$
and its contribution to the sum of squares
$\tfrac{1}{4}(y_1 - y_2 - y_3 + y_4)^2$.
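The identity between the quadratic contribution and the test of parallelism can be verified numerically. The sketch below is an illustration under stated assumptions: the log dose x takes the values −½ and +½ (origin mid-way between the two doses, unit equal to the log dose interval), the responses are ordered y1, y2 for the lower and upper standard doses and y3, y4 for the lower and upper test doses, and the numerical responses and the value of M are hypothetical.

```python
def quadratic_column(M):
    """Quadratic regressor (x -/+ M/2)**2 - (1 + M**2)/4 at the four dose points.

    Order of the returned values: standard low, standard high, test low, test high.
    """
    col = []
    for sign in (-1.0, +1.0):       # -1 for the standard curve, +1 for the test curve
        for x in (-0.5, +0.5):      # lower and upper dose
            col.append((x + sign * M / 2) ** 2 - (1 + M**2) / 4)
    return col

def contrast_ss(col, y):
    """Sum of squares of a single contrast: (sum c*y)**2 / sum c**2."""
    return sum(c * v for c, v in zip(col, y)) ** 2 / sum(c * c for c in col)

y = [3.1, 5.0, 3.9, 6.4]            # hypothetical responses y1..y4
M = 0.3                             # hypothetical log potency ratio

quad = quadratic_column(M)                  # proportional to (+1, -1, -1, +1)
slope_difference = [1.0, -1.0, -1.0, 1.0]   # (y4 - y3) - (y2 - y1)

ss_quadratic = contrast_ss(quad, y)
ss_parallelism = contrast_ss(slope_difference, y)

# Orthogonality of the quadratic column to the mean, slope and preparation columns.
mean_col = [1.0, 1.0, 1.0, 1.0]
slope_col = [-0.5, 0.5, -0.5, 0.5]
prep_col = [-0.5, -0.5, 0.5, 0.5]
dots = [sum(a * b for a, b in zip(quad, other))
        for other in (mean_col, slope_col, prep_col)]

print(ss_quadratic, ss_parallelism, dots)
```

Whatever value of M is supplied, the quadratic column is a multiple of (+1, −1, −1, +1), so the two sums of squares coincide; with only four dose groups the design cannot separate curvature from non-parallelism.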

In actual practice the most important thing is to make sure that conditions I and II are satisfied.
Even then there may be complications in practice. For example, 0.6 µg of β-carotene is by
definition equivalent to 1 unit of vitamin A, and the two are in fact equivalent for rats, but they
are anything but equivalent for man, who only utilizes about \ of the β-carotene. Similarly
vitamin D2 and vitamin D3 are equivalent for rats, but anything but equivalent for chickens.

In such cases writers should be careful about the terminology they use ; otherwise much con- 
fusion can result. 

I have very great pleasure in seconding the vote of thanks to Mr. Finney for a stimulating and 
original paper. 

Sir Percival Hartley said it would be impossible for him adequately to discuss the paper. 
He had read it with such intelligence as he possessed, and in course of time he thought he would 
be familiar with it, but he was one of those people present who belonged to biology and not to 
statistics. He felt that it was a great privilege to be there among the advance guard of that new 
scientific activity. He seemed to have spent the greater part of his own time with the mopping-up 
party in the back areas, dealing with questions which he supposed those present considered to be 
settled and done with, and not requiring any further attention ; nevertheless he still found that 
he had to be quite active in such elementary things as pointing out the importance of material 
standards for this work. Had he known that he was going to be called upon to speak, he would 
have shown an exhibition case which was in his possession containing samples of 35 biological 
standards, representing one line of progress in this field which had been achieved during the inter- 
war period. 

Further, it was still necessary to point out the difficulties which arose in attempting to define 
a unit in terms of animal reaction ; and he had had many discussions in the “back areas” in which 
he had pointed out that the error of a biological assay, which was just as important as the assay 
itself, and was really a part of it, ought to be recognized, estimated and stated. He had also had 
much work at times in pointing out to his biological colleagues that the proper time to consult 
the statistician was before a large piece of work was undertaken, and not when it was finished ; 
and that experimental design was greatly benefited by these preliminary consultations. 

He would like to say what a pleasure it was to him to be at this meeting because his own first 
contacts with this subject dated back to forty years ago, when he was working at the Lister Institute 
as a young scholar. At that time Sir Charles Martin had been Director of the Institute for a 
year or two, and one of his early activities was the preparation of the Report of the Plague Com- 
mission based on the enormous mass of data accumulated in India and in this country. He 
realized that the proper interpretation of the results called for the expert statistician, and it was 
in that connection that Professor Major Greenwood, who came to handle the data, became 
his friend and colleague at the Lister Institute. Personally, after listening to Greenwood, he 
felt that he would always have to examine tables and figures more critically than he had ever
done before. When he recalled those early days at the Lister Institute, and his young con- 
temporaries who enjoyed the same privilege, he thought that it was this kindly interest and 
encouragement of Greenwood which was the beginning of a new activity of which this inspiring 
gathering to-night was but one manifestation. 

It was the starting-point for the journey which had brought them to this gratifying present 
position. He could see some of the milestones along that road. One of the earliest and most 
outstanding was the circumstance, whatever it might have been, which took Professor R. A. Fisher 
to Rothamsted. Fisher would have adorned and enriched any part of the scientific field, in which 
he is a recognized master ; but biologists were extremely fortunate in securing his interest and 
his help in their problems. 

Another milestone on this road was the paper published in 1927 by Trevan in the Proceedings 
of the Royal Society on “The Error of Determination of Toxicity.” This gave a real “shake-up” 
to research workers in the field of serology and immunology ; it made them stop and think what 
the meaning of the terms they used might be. From this many advantages had followed, not the 
least being that first Burn, and then Gaddum, had become interested in this field and they, in 
turn, had aroused the interest and practical assistance of others. Burn and Gaddum had been 
his colleagues at the National Institute for Medical Research at Hampstead, and he had learned 
a great deal from them. Burn was with them at Hampstead for many years before going to the 
College of the Pharmaceutical Society to start the pharmacological laboratory there. Gaddum 
followed Burn at these laboratories, and he thought that a tribute should be paid to that Society 
and to its laboratories because, through the work and leadership of Burn and Gaddum and 
Katherine Coward they had been provided with their raw material in the shape of a magnificent 
quantity of first-rate biological assay data. At Hampstead they had also had the privilege of the 
services of Dr. Irwin as their adviser for a long time, and they had always found him most helpful. 
He owed more than he could ever repay to his intimate colleagues and assistants in his department, 
most particularly to the late H. P. Marks, whose tragic death had robbed them of a first-rate 
worker, to Mrs. Trevan, who was with him for many years during the time his department was
developing rapidly, and to Dr. C. W. Emmens, who only this year had come to his help in
connection with some awkward questions requiring solution. Looking back on these three friends
of his, he could not help realizing what a great deal they had got out of their lives which he himself
had missed because, while they came into this matter as biologists interested in the actual carrying 
out of assays, they appreciated that they could go much further with their problems by learning 
how to use some of these statistical implements. 

He hoped that, in the future, statisticians would perhaps come in closer touch with biologists 
in the actual conduct of their assays. In that way the statistician might learn something by s^ing 
how biologists did their work, how they sometimes had to face up to rather difficult situations, 
and how they had to obtain answers with small groups of animals. Biologists had had a great 
deal of help from statisticians, but they wanted more. On the other hand, he thought that people 
who began life as he did, and spent most of it in the biological field, would be at a great advantage 
and would reap a rich harvest if, like several of his colleagues, they went on to learn something 
about the use of statistical methods. 

Mr. N. T. Gridgeman said that he spoke as a biochemist, not as a statistician, and the paper 
was so comprehensive in its scope and penetrating in its treatment that he had little to say about 
it. But in connection with Table 9 he wondered whether the author had considered the effect 
on the efficiency of assays of unequal numbers of subjects at the different points. It was possible, 
for instance, to have the extreme dosage levels more heavily weighted than the intermediate levels, 
which would lead to different “efficiencies” than equally weighted levels, although the total weight, 
i.e., the total number of subjects, would be the same in each case. At the same time, it would, 
of course, complicate the tests for linearity, etc. 

Dr. Irwin had mentioned that the vitamin-D situation was complicated by the fact that vitamins
D2 and D3 were equipotent for rats, but not for other animals — for example, chickens. He
believed that they were also inequipotent for rats, but to a lesser degree than for chicks. Thus 
the situation would be of even greater complexity than Dr. Irwin believed. 

Dr. J. W. Trevan said that it was a great pleasure to hear this paper. He hoped that statis- 
ticians would bear in mind that the vast majority of these quantitative assays, both research and 
routine, were still carried out by people who belonged to that 90 or 95 per cent. of the community
who found it difficult to think in algebra, and a little more elaboration from the statistician as to 
what was being done would be welcome. The bestowal of the mathematical gene in human beings 
was very rare. 

He wished to raise one question which had bothered him. What was the effect of the
discontinuity which became increasingly pronounced with smaller groups? Suppose one wished to
estimate the error of determination of the value of b by an experiment on two groups of 8 animals
which had given a mortality of 1/8 with one dose and 7/8 with another dose 1 log. unit higher.
The figure he got, using the ordinary probit method, was that the probability was about 0.06
that b differed significantly from zero. If, however, he dealt with the figures by what Fisher
called the “exact method” he got P = 0.005.

He laid stress on this because groups as small as these were in fact efficient for certain biological 
assays, particularly those in the serological field, and he threw it out as a suggestion that perhaps 
someone might find an opportunity of dealing with this point. 
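The exact figure quoted by Dr. Trevan can be reproduced by enumerating the hypergeometric distribution for the 2 × 2 table of deaths and survivors. The sketch below assumes a one-sided test with both margins fixed (8 animals per group, 8 deaths in all); the function name is illustrative, and the probit calculation giving 0.06 is not attempted here.

```python
from math import comb

def fisher_exact_one_sided(deaths_low, deaths_high, n_per_group):
    """One-sided exact probability for two equal groups.

    Probability, with the total number of deaths fixed, that the low-dose
    group shows deaths_low or fewer deaths.
    """
    total = 2 * n_per_group
    total_deaths = deaths_low + deaths_high
    ways_all = comb(total, n_per_group)      # ways of choosing the low-dose group
    ways_extreme = sum(
        comb(total_deaths, k) * comb(total - total_deaths, n_per_group - k)
        for k in range(deaths_low + 1)
    )
    return ways_extreme / ways_all

# Mortality 1/8 at the lower dose and 7/8 at the higher dose.
p_exact = fisher_exact_one_sided(1, 7, 8)
print(f"one-sided exact P = {p_exact:.5f}")   # 65/12870, about 0.00505
```

Only two arrangements are as extreme as the one observed (0 or 1 deaths at the lower dose), giving 65 favourable cases out of 12,870, in agreement with the value of about 0.005.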

Dr. C. W. Emmens said that everyone whose business it was to perform biological assays 
should read Mr. Finney's paper, if only to impress on himself the depth of his ignorance. This 
communication was, of course, written for statisticians, and the author made that clear at the 
outset. He was sure that the author realized that the majority of people actually doing these 
assays needed much more in the way of detailed explanation, and needed also to be told more 
about what to do in practice. These remarks were not, however, intended to cast any reflection 
on a masterly treatment of the statistical position. 

Two main points struck the speaker as needing further elucidation by the statistician. The 
first concerned the equalizing of variances by transformations of the response. In many assays 
we had a satisfactory linear relationship between dose or log.-dose and response, but the variance 
was not constant at all levels of response. If we equalized the variance, we had a curved dose- 
response line. Now it might be argued that we could always choose a narrow enough range of 
doses to make the curvature unimportant, but we could not always, in practice, guarantee to hit 
this range, nor would the estimate of the slope be very precise with a very narrow range. In many 
assays, time to time variation in response was such that, even with a wide range over which the 
dose-response line was straight, we sometimes overshot or undershot this range, and the narrower 
it was, the more trouble we should encounter from that source. So would the statisticians please
consider the relative merits of metameters which gave a wide linear dose-response range and
unequal variances at different response levels, and those with a narrow range and equal variances 
— remembering the increasing use of such designs as the Latin square in these assays and the 
trouble one had if some responses had to be rejected. 

The second point concerned quantal responses. When dealing with a graded response, we
could take advantage of the greater similarity in the responses of, say, litter-mates, or of the
same animal at different times, as compared with responses from different animals. We arranged 
the assay so that differences between litters or animals could be segregated in an analysis of variance. 
How could we take advantage of this similarity of response when dealing with quantal responses? 
Some quantal assays did not, for instance, involve death of the animal, and it could be used 
repeatedly. We could, therefore, do cross-over tests as in the assay of insulin by the rabbit 
method in which a graded response was used. If we did this, estimates of differences between 
standard and test, of the slope, etc., which depended in part or whole on sums of differences of 
responses of the same group of animals at different times, would have a lower variance than would 
be the case had we not used a cross-over design. The statistical problem was, therefore, to estimate 
the correlation between responses from the same animal when these responses were quantal and not 
quantitative. In cases where a cross-over test was not possible, the equal distribution of litter mates
between dosage groups should help to guarantee a more precise estimate of potency than would 
otherwise be possible. Once more, how do we take advantage of this in analysing quantal assays? 
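The gain from a cross-over design with quantal responses can be illustrated by the variance of the difference between two 0/1 responses. The sketch below is only an illustration of why the correlation matters, not a method of estimating it: the response probabilities and the correlation are hypothetical values, and the correlation chosen must be one that is attainable for the given probabilities.

```python
from math import sqrt

def var_difference(p1, p2, rho=0.0):
    """Variance of X1 - X2 for two 0/1 responses.

    p1, p2 are the probabilities of response on the two occasions and rho is
    the correlation between the two responses (0 for different animals).
    """
    v1 = p1 * (1 - p1)
    v2 = p2 * (1 - p2)
    return v1 + v2 - 2 * rho * sqrt(v1 * v2)

p_standard, p_test = 0.3, 0.6      # hypothetical response rates
rho_same_animal = 0.4              # hypothetical correlation within animals

var_independent = var_difference(p_standard, p_test)
var_cross_over = var_difference(p_standard, p_test, rho_same_animal)

print(f"different animals: {var_independent:.4f}")
print(f"same animals:      {var_cross_over:.4f}")
print(f"ratio:             {var_cross_over / var_independent:.3f}")
```

Any positive correlation between responses of the same animal reduces the variance of the difference, which is the advantage the cross-over design seeks; the practical difficulty raised in the discussion is that this correlation has to be estimated from quantal data.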

Mr. A. L. Bacharach said that he was neither a statistician nor a biologist but a mere chemist. 
Many years ago he was pitchforked into a position where he had to learn some biology; from that 
he was pitchforked into a position where he had to learn how to use and interpret statistics. He 
would like to say there, what he had said before in writing, that the amount of assistance he had 
had from statisticians — a number of them were in that room — had been very great and had always 
been given with the utmost willingness and with a patience which he thought could not be exceeded. 
He remembered being with a medical colleague who asked a medical man from across the Atlantic 
during the war, “May I pick your brains?” The American visitor replied, “You are welcome 
to anything you can find there.” The same spirit had been shown in this field. 

Although it would be quite impertinent for him to discuss either biology or statistics after 
making this disclaimer, and certainly to discuss the merits of the paper, which he hoped gradually 
to absorb, he wished to call attention to a question arising out of one of the author's obiter dicta. 
He had pointed out, as others had done previously, that the most efficient distribution of subjects 
for test, for the maximum precision of potency ratio, had sometimes to be modified because of 
the need for also establishing the validity of the assay by tests of linearity and parallelism and so 
on. Consequently it was often necessary to distribute the animals in such a way as to run some- 
thing more than a risk — almost a certainty — that some of the responses would fall outside the 
range of linearity. If he might borrow an analogy he thought there was in assays a “time-number
continuum” and one could increase the time with decrease of numbers or increase the numbers
with decrease of time! Unless time was the essence of the contract (and the assays took at least 
several days) it might be more economical and ultimately result in a smaller number of calculations 
if one cut the Gordian knot in a different way and carried out the work by first undertaking a 
preliminary assay over a wide range of doses for a shorter time — even for a much shorter time — 
than the ordinary period of the assay. It was possible by taking, say, three doses in the ratio of 
1 to 4 to 16, to carry out a preliminary vitamin A assay for a week only, instead of three, and
to get therefrom sufficient information to enable one to undertake subsequently a valid four- 
point assay, in which information from all the animals could be used. 

These preliminary investigations — probe-tests, as he called them — seemed to be an integral 
part of assay procedure when dealing with a substance the potency of which one could not foretell 
with any degree of accuracy. It would be of assistance to have advice from the statisticians on 
how such probe tests could most economically be carried out, and to what extent the data supplied 
by them could be incorporated with the results obtained from the subsequent four-point or other 
kind of routine assay. 

With that observation on a purely practical point he desired to say how much he had appre- 
ciated the invitation to come to that meeting and the value of the paper and the discussion. 

The vote of thanks to the author was put to the meeting and carried unanimously. 

The following contributions were received in writing: 

Dr. O. L. Davies: Mr. Finney’s publications on biological assays have been of considerable 
value to those who are interested in the statistical interpretation of these assays, and the present 
paper represents another valuable contribution. It illustrates very forcibly the advantages gained 
in applying statistical methods to biological assay. The advantages lie, not only in the calculation
of activities and their precision, but also in the design of investigations undertaken to determine
the conditions of the assay which give the most satisfactory result, and ultimately, in the design
of the assay itself. Any method of assay which is likely to be used repeatedly requires at the outset
or in the early stages to be examined fully for at least all the main sources of error which arise,
and in such investigations there is abundant scope for the application of the statistical principles
of “planning of experiments.” Attention to these matters may well result in major economies
in the long run and in greater precision per unit of effort. The statistician should be called in
at the earliest possible stage.

In most biological assays involving a growth factor, the extent of growth varies with the time 
allowed for growth and also with other conditions of the experiment, e.g. temperature of incubation 
for microbiological assays. The slope of the response-log dose curves often depends on the
time allowed for growth and on temperature of incubation. Other sources of error being constant, 
the highest precision is given by the conditions which give rise to the highest slope. If the highest 
precision is given by an inconveniently long time of incubation, then the loss in precision by a 
shorter incubation time may have to be compensated for by a higher degree of replication. The 
relation between precision and time of incubation will enable the optimum economic conditions 
to be derived. Other conditions can be tackled in a similar way. 
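The trade-off between slope and replication can be put in rough numerical terms. The sketch below relies on the usual first-order approximation in which the variance of the estimated log potency is proportional to s²/(b²n), with s² the error variance, b the slope and n the number of subjects per preparation; the error in the slope itself is neglected, s² is taken as unchanged by the incubation time, and the two slopes are hypothetical.

```python
def approx_var_log_potency(s2, slope, n_standard, n_test):
    """First-order variance of the log potency ratio, ignoring error in the slope."""
    return (s2 / slope**2) * (1.0 / n_standard + 1.0 / n_test)

def replication_factor(slope_long, slope_short):
    """Factor by which replication must rise to offset a lower slope."""
    return (slope_long / slope_short) ** 2

slope_long, slope_short = 12.0, 8.0   # hypothetical slopes for long and short incubation
factor = replication_factor(slope_long, slope_short)

var_long = approx_var_log_potency(1.0, slope_long, 8, 8)
var_short = approx_var_log_potency(1.0, slope_short, 8 * factor, 8 * factor)

print(f"replication must be multiplied by {factor:.2f}")
print(f"variances: long {var_long:.5f}, short with extra replication {var_short:.5f}")
```

Because the required replication rises with the square of the ratio of slopes, a moderate loss of slope can be costly, and this is the quantity to set against the saving in incubation time.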

In chemotherapy we frequently require to compare the activities of several compounds. It is 
necessary to compare the regression of the various compounds in order to assess whether or not 
the actions of the compounds are similar, and when the regression of response on log dose is 
linear (this is usually the case over a sufficiently wide range of doses) the analysis of the responses 
may be conveniently expressed in the following form : 

Source of variation                   Sum of Squares    Degrees of Freedom    Variance
Between regression coefficients
About individual regressions
About common regression
Within doses
Total

From this we can readily assess the significance of the variation between regression coefficients 
and the variation about the regressions. 
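An analysis of this form might be computed as in the following sketch. It is an illustration under assumptions rather than Dr. Davies's own procedure: the row “About common regression” is taken to be the sum of the two rows above it, the Total row is omitted, and the data (two compounds, three log doses, two responses per dose) are hypothetical.

```python
def regression_anova(data):
    """Sums of squares for comparing log-dose regressions of several compounds.

    data maps compound name -> {log dose: [individual responses]}.
    Returns {row name: (sum of squares, degrees of freedom)}.
    """
    sxx_list, sxy_list = [], []
    syy_total = ss_within = 0.0
    df_within = n_dose_groups = 0
    for doses in data.values():
        xs = [x for x, ys in doses.items() for _ in ys]     # one x per response
        ys_all = [y for ys in doses.values() for y in ys]   # same order as xs
        xbar = sum(xs) / len(xs)
        ybar = sum(ys_all) / len(ys_all)
        sxx_list.append(sum((x - xbar) ** 2 for x in xs))
        sxy_list.append(sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys_all)))
        syy_total += sum((y - ybar) ** 2 for y in ys_all)
        for ys in doses.values():
            mean = sum(ys) / len(ys)
            ss_within += sum((y - mean) ** 2 for y in ys)
            df_within += len(ys) - 1
            n_dose_groups += 1
    k = len(data)
    ss_individual = sum(b * b / a for a, b in zip(sxx_list, sxy_list))  # k separate slopes
    ss_common = sum(sxy_list) ** 2 / sum(sxx_list)                      # one common slope
    between_slopes = ss_individual - ss_common
    about_individual = syy_total - ss_individual - ss_within
    return {
        "Between regression coefficients": (between_slopes, k - 1),
        "About individual regressions": (about_individual, n_dose_groups - 2 * k),
        "About common regression": (between_slopes + about_individual,
                                    n_dose_groups - k - 1),
        "Within doses": (ss_within, df_within),
    }

data = {  # hypothetical responses
    "A": {0.0: [10.0, 12.0], 1.0: [15.0, 17.0], 2.0: [21.0, 23.0]},
    "B": {0.0: [8.0, 9.0], 1.0: [14.0, 15.0], 2.0: [19.0, 22.0]},
}
table = regression_anova(data)
for row, (ss, df) in table.items():
    print(f"{row:34s} {ss:8.4f} {df:3d}")
```

Each of the first three rows, divided by its degrees of freedom, would be compared with the variance within doses, subject to the reservation made in the next paragraph about whether that variance is a proper estimate of error.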

This leads to my next point. I find myself in disagreement with the previous statement 
that a significant variation about a linear regression, when compared with the variance amongst 
individual responses for a fixed dose, is necessarily an indication of a departure from linearity. 
This would be the case if the assay had been carried out under conditions of complete randomization. 
For routine work this is usually impracticable, and any attempt at complete randomization would 
probably introduce mistakes in dosing. In practice the subjects are usually dosed or treated in 
groups, and an apparent significant variance about a linear regression may be due to an error 
arising between the groups. Errors in dosing or preparation of the dose common to the individuals 
of each dose would also give rise to an apparent variation about the linear regression. The variance 
within doses would then be an underestimate of the error variance. This might be the case in 
the analysis given in Table 2. The variance about the regressions, although not significant, is 
more than twice the variance within doses. Unless the pairs of results for each dose are genuine 
replicates, i.e. prepared from independent weighings, dilutions, etc., and unless the dosing was 
carried out in a completely random order, I would prefer to take the variance about the linear 
regression as my estimate of error for this particular assay. In routine work there would be many 
such assays, and the question of the significance of the variance about the regression could then 
be tested more precisely by pooling the information from a number of these assays. 

Dr. Eric C. Wood: I am much impressed by the clear manner in which Mr. Finney has
demonstrated the essential unity of the theory underlying all biological assays. Considering the widely
varying types of assays that are in use for all manner of purposes to-day, and the apparently very 
different methods of computation used in calculating the results, this is a real achievement. I 
would not presume to cross swords with Mr. Finney on the purely statistical sections of his paper. 
The remarks which follow are concerned only with that part of his paper which deals with the 
design of assays. 

I think it should be emphasized — indeed, it cannot be too often proclaimed as a cardinal 
principle — that assay design is a matter requiring the close co-operation of the statistician and the 
experimenter. It is impossible for either to decide by himself what is the best experimental design
to use in any given set of circumstances. Theoretical and practical considerations interact to a
very material extent; moreover, it is unfortunately true that they are often in conflict, so that the 
assay design eventually adopted has to be in the nature of a compromise. A good example of this 
necessity for compromise is encountered when endeavouring to decide how many different dose- 
levels of the Standard and Test Preparations to use. Mr. Finney has pointed out that an assay 
design in which only two dose-levels of each preparation are used is inferior to designs in which there 
are three or more, since in the former case the linearity of the response curves cannot be examined. 
This, of course, is perfectly true, and wherever it is practicable to do so, the 6-point assay design is 
definitely preferable to the 4-point design, but there is a very real practical difficulty. More often 
than not it is desired to assay more than one Test Preparation simultaneously, and in a very large 
number of assay techniques, particularly assays of vitamins and other growth factors using rats, 
inter-litter variance is found to be highly significant. It is therefore very desirable to distribute 
litter-mates evenly between one’s dose-groups. If, therefore, two Test Preparations are to be 
assayed, it is not possible to have three dose-levels of Standard Preparation and of each of the 
Test Preparations unless one has available several litters of at least nine animals, and in our colony 
(and I think I may say in those of other people as well) such large litters are not readily forthcoming.
Consequently, in order to be able to use litters of six or seven animals, one is driven to adopt an 
assay design in which there are only two dose-levels of each preparation. This kind of difficulty is 
greatly intensified in assays such as those of Vitamin A, in which the responses of the two sexes 
are so different in degree and kind that it is almost essential to work with animals of one sex 
only. 

I also endorse the desirability of so choosing the doses of one’s Test Preparation that the mean 
responses obtained are as nearly as possible equal to those obtained from the corresponding doses 
of Standard Preparation. Mr. Finney has advanced one reason for this, namely, that the precision 
of the assay is increased, but there is also the important point that if the responses to the doses 
of Test Preparation are materially removed from the responses to the Standard Preparation, one 
or more of the Test Preparation doses may be outside the range of linearity of the assay technique. 
In an assay design in which there are three or more doses of each preparation, it may be possible 
to detect from the internal evidence of the assay itself that this has occurred. If, however, there are 
only two doses of each preparation, it may not be possible to distinguish between invalidity caused 
by a real qualitative difference between the Standard Preparation and the Test Preparation in their 
action on the Test animals, and apparent invalidity caused by one of the Test responses falling 
outside the range of linearity. It is, of course, not always possible with unknown materials to 
guess sufficiently shrewdly the correct doses to use, but it should certainly be taken as a working 
principle that the result of any assay in which the Test responses are far removed from the Standard 
responses should be regarded as suspicious, and it is highly desirable in such an event to carry 
out a second assay in which the Test doses are adjusted in the light of the experience gained in the 
first assay so as to be more nearly equivalent to the Standard doses. I must admit, however, 
that this is a counsel of perfection. A single biological assay may well involve the use of 40-60 
rats for several weeks ; and unless one has a gigantic colony at one’s disposal, one is often forced 
to omit repeat assays and other investigations which are clearly desirable because one simply 
has not got enough animals available for the purpose. This is one of the strongest arguments 
for employing statistical methods to ensure that one uses in one's assays just enough animals to 
obtain the required degree of precision, and no more; it is in this direction that the application 
of statistics can return a very real dividend in hard cash. 

It is greatly to be wished that the official and semi-official bodies which are responsible for 
producing standard procedures of biological assay would pay at least as much attention to the 
method of computation of the result of the assay as they do to the description of the practical 
details of the assay itself. It is very surprising, and indeed deplorable, that so many published 
assay procedures are seriously at fault in this respect. I will not comment on the assay methods
of the British Pharmacopoeia (1932) because after all this is fifteen years old, and I understand 
that the new Pharmacopoeia which is soon to be published will leave little to complain of from 
the statistical aspect. The United States Pharmacopoeia, however, which is much more recent, 
uses assay designs for Vitamins A and D in which there is but one dose of Standard Preparation 
and one dose of Test Preparation — a 2-point assay, in fact— and the assay is regarded purely as 
a limit test, no attempt being made to compute the actual potency of the preparation. Even 
more recent is the 1946 edition of the Official Methods of Analysis of the American Association 
of Official Agricultural Chemists. It is most disappointing to find that the method given for the
biological assay of Vitamin B1 by the rat-growth method contains many careful details designed
to ensure the physical similarity of the groups on the Standard Preparation and the groups on the 
Test Preparation, but that when it comes to the examination of the results, it is stated only that 
if the average gain in weight in a Test Group is equal to or greater than that in a Standard Group,
then the Vitamin B1 content of the total Test material fed is equal to or greater than that of the
total Standard material fed. It seems extraordinary that this is the best that can be done in the 
1946 edition of a book published under the auspices of an Association which could command the 
services of the best statistical brains of the United States. 

Mr. Finney, in reply, said that he was grateful for the kind remarks made by Mr. Fieller, 
Dr. Irwin and other speakers. He felt honoured by the presence of some whose names are
illustrious in the history of biological assay, especially Sir Percival Hartley and Dr. Trevan, though he
feared that, from their point of view, his paper must have seemed a riot of mathematics, with little 
relevance to assay practice. He proposed to answer some of the comments more fully in print, 
but a few points he would deal with immediately. 

As a mathematician, he shared the views of Mr. Fieller and Dr. Irwin that the addition of 5 to 
normal and other equivalent deviates was unnecessary. On the other hand, the probit was now 
so firmly established that an attempt to revert to the normal equivalent deviate seemed scarcely 
worth while, and, however little advantage the addition of 5 might give, it seemed to him desirable 
to define logits or other similar transformations according to the same convention. He would 
not venture to join issue with Dr. Irwin on nomenclature, and would plead that he himself had 
not invented the words referred to. 

With reference to Mr. Fieller’s comment on equation (27), he hoped that the misprints would 
be removed before publication. The form of the equation seemed to him a matter of personal 
preference ; he had found his version both convenient to compute, and indicative of the extent 
to which an approximation based on equation (24) could safely be used. 

He had been interested to see Dr. Irwin’s discussion of the four-point assay, but, like Mr. 
Fieller, he felt that the property of giving the same result for a parabolic as for a linear regression 
was a “ mathematical fluke.” He considered that very great care should be exercised in the 
interpretation of results from assays of this design. 

Mr. Gridgeman had enquired about the use of unequal numbers of subjects at different dose 
levels. The effect of this on the precision of the assays could be investigated in the manner of 
Section 7 and the Appendix, but he doubted whether this would lead to improvements in design 
that could not be predicted from the Tables already given. Full tabulation would be very 
laborious, but he would be glad to examine any more definite suggestions that Mr. Gridgeman 
might put forward. 

Though he did not claim to be a geneticist, he had always denied the existence of the mathe- 
matical gene, to which Dr. Trevan had referred. He thought that with a little patience almost 
anyone could become a competent mathematician, whereas biochemistry or pharmacology re- 
quired something akin to intelligence. Nevertheless, in writing for the biologist, a statistician 
encountered the difficulty of guessing how much familiarity with mathematical and statistical 
technique he could assume in his readers. The present paper endeavoured to survey a wide field ; 
in order to keep it within reasonable length he had written primarily for statisticians, and had not 
given details of practical applications of illustrative examples as fully as he would have liked. 
He would like to assure Dr. Trevan, however, that he shared his belief that the statistician should 
not be content to give expositions of statistical theory, but should also endeavour, on suitable 
occasions, to make the essentials of his theory intelligible to the users of statistical technique. 

He hoped for future discussion with Mr. Bacharach on “probe-tests” and the time factor in 
assays. He had not met that particular type of problem before, but obviously it was important 
to the most economical use of material. Possibly recent developments in sequential sampling 
techniques might be adapted for use here. 

In preparing the paper, his outlook had been influenced by the wide range of problems in 
which the statistical methods developed for assay purposes had been applicable, even though 
these were not biological assays in the strict sense. He recalled that recently, on one day, four 
requests for advice, from different Departments of the University of Oxford and on problems 
otherwise unrelated, had been best answered by reference to quantal responses and the probit 
method. 

Mr. Finney later added the following remarks in writing : 

I fully agree with Dr. Irwin’s comment on the behaviour of χ² for small numbers. I fear 
that I expressed myself rather loosely in my paper, but I have discussed the point more fully 
elsewhere (Finney, 1947). The statistic calculated from the data has the same expectation as a 
true χ², but its sampling distribution has a greater variance ; it seems to me, however, that only 
the larger values need careful watching, since the others are unlikely to lead to faulty conclusions. 

The use of a two-variate regression equation for slope-ratio assays, which Dr. Irwin rightly 
admires, was first proposed by my friend Dr. E. C. Wood. I have suggested to him that additional 
information relevant to the validity of an assay might be obtained from tests of mixtures of doses 
of the two preparations, which should give responses corresponding to other parts of the regression 
“plane,” but I have not yet found a biochemist with time to spare for making trial of this. 

I cannot agree entirely with Mr. Fieller’s claim that equalizing the variance is more important 
than rectification, though, as I have said in a footnote on p. 51, I do not think our points of view 
completely opposed. Dr. Irwin pointed out (as I had of course realized) that the analysis for 
quantitative responses can be modified so as to take account of unequal variances, but his proposal 
breaks down unless the relative magnitudes of these variances are known. When σ² is clearly 
dependent upon U, I would prefer to follow Mr. Fieller in seeking first a transformation of response 
which will give a constant variance, but I should want to follow this with a further transformation 
according to the rules of my paper, unless linearity were obtained, by chance, at the same time as 
the constant variance. Non-linearity may easily result in biased estimation of potency, whereas 
neglect of a variation of σ² with U is unlikely to have consequences worse than wrong assessment 
of precision — a less serious fault. 

Perhaps one reason why Mr. Fieller and I differ in outlook is that he has had considerable 
experience of routine commercial assays, whereas my interests have been more restricted to research 
problems. I think that both points of view are important, though they may lead to different 
conclusions on such questions as the best form for computations and the economics of the division 
of labour between experimental and statistical work. For this reason, I have not attempted to 
discuss recent American work on the use of control charts in routine assays, and on condensed 
computing techniques for assays of standard design. In spite of what Mr. Fieller has said, 
I believe that graphical methods can be very useful. I would always encourage the research worker 
to draw his dose-response diagram. The experienced worker can often draw lines by eye that will 
give him an estimate very close to that he would obtain by calculation, and which, as he can see 
by inspection, lies within his limits of tolerance for precision. For many purposes an exact 
statement of fiducial limits is not needed, providing that there is clear evidence that they are not 
wider than a specified amount. Furthermore, the diagram may draw attention to systematic 
departures from linearity that could pass unnoticed in a purely mechanical calculation. I should 
not like my remarks on graphical methods to be interpreted as excuse for lack of care and rigour 
in statistical analysis, but I do suggest that graphical techniques may properly be employed when 
they not only save time, but also can be seen to give results indistinguishable from those of a full 
analysis. In any case of doubt, I would insist on the arithmetical procedure, and I have no wish 
to discourage any analyst who is prepared always to undertake the arithmetic. 

Dr. Emmens’s question about the use of litter-mates in assays based on quantal responses is 
very interesting. I have never encountered an assay in which this control of variation was employed, 
but I agree that it ought to improve the precision of estimation. One method of performing the 
calculations would be to introduce concomitant variates C₁, C₂, . . . such that Cᵢ = 1 
for all members of the i-th litter and zero for all other subjects. A multiple co-variance analysis 
on the Cᵢ, followed by adjustments based on the regression equation, would then remove the 
component of variation attributable to litter differences. This procedure, however elegant 
mathematically, would be arithmetically very laborious, and the problem merits closer examination 
from the practical point of view. 
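
The concomitant variates described in the preceding paragraph are simply 0/1 indicator columns, one for each litter. The following Python sketch is an illustration only: the function name and the litter labels are hypothetical and do not come from the original reply. It shows how such columns might be built before being entered into a covariance analysis.

```python
def litter_indicators(litters):
    """Build one 0/1 column per litter.

    `litters` lists the litter label of each subject in order.
    The column for litter L holds 1 for members of L and 0 otherwise.
    """
    labels = sorted(set(litters))
    return {
        label: [1 if member == label else 0 for member in litters]
        for label in labels
    }


# Hypothetical example: six subjects drawn from three litters.
litters = ["A", "A", "B", "B", "B", "C"]
columns = litter_indicators(litters)
```

Each subject scores 1 on exactly one column, so the columns sum to a constant; in a regression one of them would normally be dropped or absorbed into the general mean.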

I am glad that Dr. Davies has drawn attention to the danger that lack of randomization in the 
testing of different dose-levels may be responsible for an apparently significant deviation from 
linearity. Assays that I have examined since the presentation of my paper suggest that appreciable 
underestimation of error variance may frequently arise from this source. Dr. Davies's recommen- 
dation to use deviations from the linear regression to give an estimate of error variance should 
overcome the difficulty, providing that sufficient degrees of freedom are available to make this 
estimate worth having. I should like to see experimental investigation of this point made in a 
number of laboratories, as the personal factor may be important and theory cannot predict the 
results of incomplete randomization. 

Dr. Wood has mentioned the difficulty of using litter-mate control of variation when several 
preparations are to be assayed simultaneously, so that the number of doses of all preparations 
exceeds the size of litter normally available. I deliberately avoided discussion of simultaneous 
assays, so as not to complicate my argument still further ; they do not, I think, introduce new points 
of principle, though they may require modifications of design for optimal efficiency. Dr. Wood’s 
problem may be solved by the adoption of the incomplete block designs which have been found 
suitable for dealing with analogous difficulties in field experimentation. The balanced, and perhaps 
also the partially balanced, incomplete block-schemes, in which restrictions of symmetry are placed 
on the number of litters providing comparisons of any pair of doses, seem particularly appropriate, 
and doubtless special variants of the symmetry conditions would ensure maximum precision for 
the most important contrasts. 




The Random Division of an Interval 
By P. A. P. Moran 

In a recent address to the Royal Statistical Society¹ Professor M. Greenwood has proposed 
the following problem, which is of interest in providing a test of significance in the statistical 
study of infectious diseases. Let n points be distributed at random in the unit interval (0, 1), 
thus forming n + 1 intervals $l_1, \ldots, l_{n+1}$. What is the statistical distribution of 

$$S = \sum_{i=1}^{n+1} l_i^2 \; ?$$

In the present paper we discuss this problem. 
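
The statistic can be explored by simulation before any theory is developed. The Python sketch below is an illustration only; the function names, the seed and the number of trials are assumptions made for the example. It places n random points on the unit interval, forms the n + 1 interval lengths, and returns the sum of their squares.

```python
import random


def greenwood_statistic(n, rng):
    """Drop n uniform points on (0, 1) and return the sum of squared gap lengths."""
    cuts = sorted(rng.random() for _ in range(n))
    edges = [0.0] + cuts + [1.0]
    return sum((b - a) ** 2 for a, b in zip(edges, edges[1:]))


def simulate(n, trials, seed=0):
    """Return `trials` independent values of S for n random points."""
    rng = random.Random(seed)
    return [greenwood_statistic(n, rng) for _ in range(trials)]


samples = simulate(n=5, trials=20000)
mean_s = sum(samples) / len(samples)
```

For n = 5 the sample mean should lie close to 2/(n + 2) = 2/7, the exact mean obtained in the section on the moments of the distribution, and every simulated value lies between 1/(n + 1) and 1.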

The theory of various distributions connected with the random division of an interval has 
been discussed by many writers (Clifford,² v. Bortkiewicz,³ Morant,⁴ Fisher,⁵ Garwood,⁷ and 
others) and has many applications to statistical problems. Both Clifford and Fisher remark 
that this problem is equivalent to considering n + 1 quantities $x_i = |l_i|$ which are non-negative 
and satisfy 

$$\sum_{i=1}^{n+1} x_i = 1,$$

and whose probability distribution is such that equal areas on the positive part of the n-dimensional 
plane defined by this equation correspond to equal probabilities for the corresponding divisions 
of the interval. 

The part of this plane common to all the half-spaces $x_i > 0$ is an n-dimensional regular 
simplex. S lies between unity and the square of the perpendicular from the origin on to the 
plane $\sum_{i=1}^{n+1} x_i = 1$. The perpendicular is of length $(n+1)^{-1/2}$. When S is greater than 
the square of this length, the (n + 1)-sphere defined by S = constant intersects $\sum_{i=1}^{n+1} x_i = 1$ 
in an n-sphere of radius r where $r^2 = S - (n+1)^{-1}$. The problem of finding the distribution of S 
therefore reduces to that of finding the volume common to a regular n-simplex and a sphere of 
varying radius whose centre is the circumcentre of the simplex. 

This representation of the problem gives us a certain amount of information about the nature 
of the distribution. The simplex is bounded by n + 1 (n − 1)-dimensional planes (the faces), 
½(n + 1)n “edges” of dimension n − 2, ⅙(n + 1)n(n − 1) “edges” of dimension n − 3, and 
so on down to n + 1 zero-dimensional vertices. When r is small the n-sphere lies entirely within 
the simplex. As r becomes larger than the radius of the in-sphere, the sphere becomes intersected 
by the (n − 1)-planes and the analytic form of the distribution changes. It changes again when 
the sphere becomes large enough to intersect the (n − 2)-dimensional “edges,” and so on. Thus 
the distribution is analytic in n stretches, and at the ends of these the higher derivatives are 
discontinuous. 

We may also consider the distribution in the following way : Let $g_1(r)$ be the volume of the 
sphere cut off by one of the (n − 1)-planes in which the faces of the simplex lie, $g_2(r)$ the volume 
cut off by two of these planes, and so on. $g_i(r)$ will be zero until r is large enough to intersect 
the element of the boundary of dimension n − i. Then the volume of the region common to 
the sphere and the simplex will be 

$$\frac{\pi^{n/2} r^n}{\Gamma(\tfrac{1}{2}n + 1)} + \sum_i \binom{n+1}{i} (-1)^i g_i(r),$$

the sum extending over all non-zero $g_i(r)$’s. When $g_i(r)$ is greater than zero it is an increasing 
analytic function and not all of its derivatives are zero at the point where it is zero itself. 




It might be possible to calculate the $g_i(r)$ exactly by using n-dimensional hyperspherical 
trigonometry but this would be difficult. However, it is possible to express the distribution in 
the n-dimensional case in terms of the result for (n − 1) dimensions. To see this write $v_n(r)$ for 
the volume common to an n-sphere of radius r and an n-simplex of unit side and the same centre. 
Then writing $\Phi_n(x)$ for the probability that S < x, we obtain by a little calculation that 

$$\Phi_n(x) = 2^{n/2}\, n!\, (n+1)^{-1/2}\, v_n\!\left(\sqrt{\tfrac{1}{2}\{x - (n+1)^{-1}\}}\right).$$

With each of the n + 1 faces of the simplex the centre forms an n-dimensional pyramid. Let 
$\rho_n$ be the radius of the in-sphere of the simplex. It is easy to see that 

$$\rho_n^2 = \{2n(n+1)\}^{-1}.$$

Consider the part of the sphere of radius r which lies inside one of these pyramids, and take an 
(n − 1)-plane parallel to the base of the pyramid and distant x from the centre. The 
(n − 1)-dimensional measure of the part of this plane lying inside both the pyramid and the sphere is 

$$(x\rho_n^{-1})^{n-1}\, v_{n-1}\!\left(\rho_n x^{-1}\sqrt{r^2 - x^2}\right),$$

where we write $v_{n-1}(r) = 0$ when r is imaginary and $v_{n-1}(r) = 1$ when r is greater than 
$[2n(n-1)^{-1}]^{-1/2}$, the circumradius of an (n − 1)-dimensional simplex of unit side. It follows 
that, when r is less than $[2n^{-1}(n+1)]^{-1/2}$, the radius of the circumsphere, 

$$v_n(r) = (n+1)\int_0^{\rho_n} (x\rho_n^{-1})^{n-1}\, v_{n-1}\!\left(\rho_n x^{-1}\sqrt{r^2 - x^2}\right) dx$$

$$\qquad = (n+1)\,\rho_n\, r^n \int_{\sqrt{r^2-\rho_n^2}}^{\infty} (\rho_n^2 + t^2)^{-(n+2)/2}\, v_{n-1}(t)\, t\, dt,$$

the second form following from the substitution $t = \rho_n x^{-1}\sqrt{r^2 - x^2}$, 
and $v_n(r) = 1$ when r is greater than this circumradius. 

Using the above formula it would be possible to evaluate the distribution for the smaller 
values of r, starting from the solution for n = 2, which is easy to calculate by elementary geometry 
(given by Greenwood¹). However, there is a loss of accuracy at each stage, and the process 
would probably not be suitable in cases where n is greater than 10. We also remark that for 
any n the distribution is easy to calculate when r is less than the radius of the in-sphere of the 
simplex. This was done by Isserlis (Greenwood¹), and had been done previously by Clifford.² 

The Moments of the Distribution 

In what follows it is convenient to cast the problem into a slightly different form. Let 
$x_1, \ldots, x_{n+1}$ be quantities independently distributed on (0, ∞) with the probability function 
$\beta e^{-\beta x}\,dx$. Then the joint distribution of the quantities $x_i(x_1 + \ldots + x_{n+1})^{-1}$, where 
i = 1, . . . n + 1, is equivalent to the joint distribution of the lengths $|l_i|$. This was pointed out 
by Fisher⁵,⁶ and had been previously proved by Clifford. Fisher was concerned with the distribution 
of the largest of the intervals, whereas we wish to discuss the distribution of the sum of squares, 
that is, the distribution of 

$$(x_1^2 + \ldots + x_{n+1}^2)(x_1 + \ldots + x_{n+1})^{-2}.$$

We write A for the region common to all the half-spaces $x_i > 0$, B(T) for the part of this region 
cut off by the “plane” $x_1 + \ldots + x_{n+1} = T$, and B for B(1). The moment $\mu_r'$ about the 
origin is then given by 

$$\mu_r' = \beta^{n+1}\int_A (x_1^2 + \ldots + x_{n+1}^2)^r (x_1 + \ldots + x_{n+1})^{-2r}\, e^{-\beta(x_1 + \ldots + x_{n+1})}\, dx_1 \ldots dx_{n+1}$$

$$\quad = \beta^{n+1}\lim_{T\to\infty}\int_{B(T)} (x_1^2 + \ldots + x_{n+1}^2)^r (x_1 + \ldots + x_{n+1})^{-2r}\, e^{-\beta(x_1 + \ldots + x_{n+1})}\, dx_1 \ldots dx_{n+1},$$

and writing 

$$x_i = T y_i \quad (i = 1, \ldots, n+1),$$

$$\mu_r' = \beta^{n+1}\lim_{T\to\infty} T^{n+1}\int_{B} (y_1^2 + \ldots + y_{n+1}^2)^r (y_1 + \ldots + y_{n+1})^{-2r}\, e^{-\beta T(y_1 + \ldots + y_{n+1})}\, dy_1 \ldots dy_{n+1}.$$

Using Dirichlet’s integral (Whittaker and Watson,⁸ p. 258) we get 

$$\mu_r' = \beta^{n+1}\lim_{T\to\infty}\sum \frac{r!\,(2a_1)! \ldots (2a_{n+1})!}{a_1! \ldots a_{n+1}!\,(n+2r)!}\; T^{n+1}\int_0^1 \tau^{\,n+2r}\,\tau^{-2r}\, e^{-\beta T\tau}\, d\tau,$$

where the sum is taken over all partitions of r into non-negative integers $a_i$. Then we have 

$$\mu_r' = \lim_{T\to\infty}\sum \frac{r!\,(2a_1)! \ldots (2a_{n+1})!}{a_1! \ldots a_{n+1}!\,(n+2r)!}\; \beta^{n+1} T^{n+1}\int_0^1 \tau^{n} e^{-\beta T\tau}\, d\tau$$

$$\quad = \lim_{T\to\infty}\sum \frac{r!\,(2a_1)! \ldots (2a_{n+1})!}{a_1! \ldots a_{n+1}!\,(n+2r)!}\int_0^{\beta T} t^{n} e^{-t}\, dt$$

$$\quad = \sum \frac{n!\, r!\,(2a_1)! \ldots (2a_{n+1})!}{(n+2r)!\; a_1! \ldots a_{n+1}!}.$$


For n > 3 we obtain 

$$\mu_1' = 2(n+2)^{-1}$$

$$\mu_2' = 2^2 (n+6)\{(n+2)(n+3)(n+4)\}^{-1}$$

$$\mu_3' = 2^3 (n^2 + 17n + 90)\{(n+2)(n+3)(n+4)(n+5)(n+6)\}^{-1}$$

$$\mu_4' = 2^4 (n^3 + 33n^2 + 434n + 2520)\{(n+2)(n+3)(n+4)(n+5)(n+6)(n+7)(n+8)\}^{-1}$$

and reducing to the mean as origin 

$$\mu_2 = 2^2 n (n+2)^{-2}\{(n+3)(n+4)\}^{-1}$$

$$\mu_3 = 2^3 (10n^2 - 4n)(n+2)^{-3}\{(n+3)(n+4)(n+5)(n+6)\}^{-1}$$

$$\mu_4 = 2^4 (3n^4 + 303n^3 + 42n^2 - 24n)(n+2)^{-4}\{(n+3) \ldots (n+8)\}^{-1}$$
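
The closed forms for the moments about the origin can be checked against the factorial sum from which they are derived. The Python sketch below is offered as an illustration under stated assumptions: the helper names are invented for the example, and the sum is taken over all ordered sets of non-negative integers $a_1, \ldots, a_{n+1}$ adding to r, which is how the multinomial expansion enters. Exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import factorial


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def raw_moment(r, n):
    """r-th moment of S about the origin for n random points (n + 1 intervals)."""
    total = Fraction(0)
    for a in compositions(r, n + 1):
        num = factorial(n) * factorial(r)
        den = factorial(n + 2 * r)
        for ai in a:
            num *= factorial(2 * ai)
            den *= factorial(ai)
        total += Fraction(num, den)
    return total


def closed_forms(n):
    """The four closed-form moments about the origin quoted in the text."""
    def falling(k):
        # (n + 2)(n + 3) ... (n + k)
        out = 1
        for j in range(2, k + 1):
            out *= n + j
        return out

    return (
        Fraction(2, n + 2),
        Fraction(4 * (n + 6), falling(4)),
        Fraction(8 * (n * n + 17 * n + 90), falling(6)),
        Fraction(16 * (n ** 3 + 33 * n * n + 434 * n + 2520), falling(8)),
    )
```

Calling `raw_moment(r, n)` for r = 1, 2, 3, 4 and comparing with `closed_forms(n)` gives exact agreement for the values of n tried here.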

A Non-linear Central Limit Theorem. 

We now prove that as n becomes large, the above distribution can be expressed asymptotically 
in terms of the normal distribution, as we would naturally expect from the asymptotic behaviour 
of the above moments. To do this we use the following theorem : 

Theorem : Let the quantities $x_i$ (i = 1, . . .) be independently distributed in a 
distribution law which has its first four moments finite and $\mu_1' > 0$. Write 

$$S_1^n = \sum_{i=1}^{n} x_i, \qquad S_2^n = \sum_{i=1}^{n} x_i^2.$$

Write $\sigma_1^2$ for the variance of $x_i$, and $\sigma_2^2$, say, for the variance $\mu_4' - \mu_2'^2$ 
of $x_i^2$, which will have a mean $m = \mu_2'$. Then the quantity 

$$t_i = m^{-1} x_i^2 - 2x_i(\mu_1')^{-1} + 1$$

has zero mean and a variance 

$$v = \mu_4' m^{-2} - 4\mu_3'(m\mu_1')^{-1} + 4m(\mu_1')^{-2} - 1,$$

and the distribution of 

$$\phi = n^{1/2} v^{-1/2}\left[\frac{n\mu_1'^2\, S_2^n}{m\,(S_1^n)^2} - 1\right]$$

tends, as n increases, to a normal distribution with zero mean and unit variance. 


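Proof : We have 

$$\frac{n\mu_1'^2\, S_2^n}{m\,(S_1^n)^2} = \left\{1 + \frac{S_2^n - nm}{nm}\right\}\left\{1 + \frac{S_1^n - n\mu_1'}{n\mu_1'}\right\}^{-2}. \qquad (1)$$

Now if $y \neq -1$, 

$$(1 + y)^{-2} = 1 - 2y + \theta y^2,$$

where $|\theta| \le |(2y + 3)/(1 + y)^2| < 8$, 
say, whenever $|y| < 4^{-1}$. Not assuming anything as yet about the denominator in equation (1) 
except that it be non-zero, we have, on putting $y = (S_1^n - n\mu_1')/(n\mu_1')$, 

$$\frac{n\mu_1'^2\, S_2^n}{m\,(S_1^n)^2} = 1 + \frac{S_2^n - nm}{nm} - \frac{2(S_1^n - n\mu_1')}{n\mu_1'} - \frac{2(S_2^n - nm)(S_1^n - n\mu_1')}{n^2 m \mu_1'} + \theta\,\frac{(S_1^n - n\mu_1')^2\, S_2^n}{n^3 m \mu_1'^2},$$

and so 

$$n^{1/2} v^{-1/2}\left[\frac{n\mu_1'^2\, S_2^n}{m\,(S_1^n)^2} - 1\right] = (vn)^{-1/2}\sum_{i=1}^{n} t_i - 2A + \theta B,$$

where 

$$A = (S_2^n - nm)(S_1^n - n\mu_1')(n^{3/2} v^{1/2} m \mu_1')^{-1},$$

$$B = (S_1^n - n\mu_1')^2\, S_2^n\, (n^{5/2} v^{1/2} m \mu_1'^2)^{-1}.$$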



Now, given any number a, we show that the probability that $\phi \le a$ converges uniformly in 
a, as n tends to infinity, to 

$$(2\pi)^{-1/2}\int_{-\infty}^{a} e^{-t^2/2}\, dt.$$

Let f, 1 / < be given positive constants. By using Tchebycheff’s inequality we can choose 
K so large that the probabilities that — 

\S 2 ^ - ~ nm\ > Kn^ni 
- n\x^^\ > 

are each less than Jt, independently of n. Now choose N so large that — 

^ II nhmyii 

n 

for all n > N, and so that the probability p(t) that $(vn)^{-1/2}\sum_{i=1}^{n} t_i \le t$ satisfies 

1 

I ,,(/) - (27t) 1 fe ^^dx\ 

for all t and all n > N. We can do this by applying the ordinary central limit theorem to 
$\sum_{i=1}^{n} t_i$, because the $t_i$ are distributed in the same distribution independently and this 
distribution has all its moments finite. 





This proves the theorem which we can now apply to our problem. For consider quantities 
$x_i$ (i = 1, 2, . . .) each independently distributed with the probability density function $\beta e^{-\beta x}$. 
This distribution has all its moments finite and we apply the theorem to 

$$S_2^n (S_1^n)^{-2} = (x_1^2 + \ldots + x_n^2)(x_1 + \ldots + x_n)^{-2}.$$

Here $m(\mu_1')^{-2} = 2$ and v = 1. It follows that 

$$n^{1/2}\left\{\tfrac{1}{2}\, n\, S_2^n (S_1^n)^{-2} - 1\right\}$$

tends, as n increases, to be distributed normally with unit variance about zero mean, i.e. that 
$S_2^n (S_1^n)^{-2}$ tends to be distributed normally about mean $2n^{-1}$ with variance $4n^{-3}$, quantities which 
are asymptotically equal to the exact values calculated above. 

Although the distribution does tend to normality ultimately, this happens rather slowly, as 
is shown by the following values of γ₁ (= √β₁) and β₂ : 

       n      γ₁      β₂
       5     1.59    6.83
      10     1.71    8.35
      50     1.22    6.39
     100      .93    5.03
    1000      .31    3.24
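
The tabulated values can be recomputed from the central moments given earlier, taking γ₁ = μ₃/μ₂^(3/2) and β₂ = μ₄/μ₂². The Python sketch below is an illustration only; the function names are assumptions made for the example.

```python
def central_moments(n):
    """Second, third and fourth central moments of S for n random points."""
    mu2 = 4 * n / ((n + 2) ** 2 * (n + 3) * (n + 4))
    mu3 = 8 * (10 * n ** 2 - 4 * n) / (
        (n + 2) ** 3 * (n + 3) * (n + 4) * (n + 5) * (n + 6)
    )
    tail = 1
    for k in range(3, 9):  # (n + 3)(n + 4) ... (n + 8)
        tail *= n + k
    mu4 = 16 * (3 * n ** 4 + 303 * n ** 3 + 42 * n ** 2 - 24 * n) / (
        (n + 2) ** 4 * tail
    )
    return mu2, mu3, mu4


def shape(n):
    """Return (gamma_1, beta_2) for the distribution of S."""
    mu2, mu3, mu4 = central_moments(n)
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2


for n in (5, 10, 50, 100, 1000):
    g1, b2 = shape(n)
    print(f"{n:5d}  {g1:5.2f}  {b2:5.2f}")
```

For a normal distribution γ₁ = 0 and β₂ = 3, so the slow approach to these limits can be read directly from the printed columns.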


The method of proof above enables us to prove more general results about the products and 
quotients of symmetric functions $S_r^n$ (r = 1, 2, . . .) of quantities $x_i$ which are all independently 
distributed in the same distribution, provided either that $\mu_1' \neq 0$ or that, if $\mu_1' = 0$, only 
symmetric functions of even order occur in the denominator. Moreover the moments of the 
distribution of the $x_i$ must be finite up to an order that will make the method of the proof work. 
By using the full force of the central limit theorem itself we can naturally relax the above 
conditions somewhat. We also remark that v. Mises⁹,¹⁰ has considered generalizations of the 
central limit theorem to non-linear functions of n independent quantities, but his theorems seem 
to be of a different nature from the above. 

Supposing for the sake of simplicity that all the moments of the distribution of the $x_i$ are 
finite, we can then deduce the following results. The sampling distribution of the statistics b₁ 
and b₂, the “studentized” moment, the cumulants, and the coefficient of variation, 
calculated for a sample of n, all tend to normality when n tends to infinity. These results are 
well known but proofs seem lacking in the literature. A very crude upper bound for the divergence 
of the distribution of these quantities for finite n from the normal can be found by applying 
Liapounoff’s theorem (Cramér,¹¹ p. 77) and Tchebycheff’s inequality in the above proof. 

The normal function also occurs in other problems of mensuration in a large number of 
dimensions. Borel¹² (see also Castelnuovo¹³) has shown how to calculate approximately the 
volume of the part of an n-dimensional cube cut off by a plane distant d from the centre of the 
cube and not parallel to any of the faces. This, of course, is equivalent to applying the central 
limit theorem to a weighted mean of a sample of n from a rectangular population. The result 
has applications to the asymptotic theory of the number of representations of a number as a 
linear form. 

Similarly we may consider the volume common to a unit cube and a sphere in n dimensions. 
Write this C(r) where r is the radius of the sphere. Then when n is large we can represent C(r) 
approximately in terms of the normal distribution function by using the central limit theorem 
on the sum of squares of the quantities, which are considered to be each distributed in a 
rectangular distribution. It follows that for large N we can represent, after a little reduction, the 
numbers of partitions of numbers less than or equal to N into the sum of n squares each less 
than or equal to a number M, say, in terms of C(r), where n is kept fixed and N and M increase. 
C(r) can itself be calculated with a relative error which decreases as n increases. 


References 

¹ Greenwood, M., J. Roy. Stat. Soc. 

² Clifford, W. K. (1866), “Solution to Problem 1878,” Educational Times, Jan. Reprinted in Mathematical Papers, pp. 601-607. 

³ v. Bortkiewicz, L. (1915), Bull. de l’Inst. Intern. de Statistique, tome xx, livre 2, pp. 30-101. 

⁴ Morant, G. (1920), Biometrika, 13, 309-337. 

⁵ Fisher, R. A. (1929), Proc. Roy. Soc. A, 125, 54. 

⁶ Fisher, R. A. (1940), Ann. Eugenics, 10, 14. 

⁷ Garwood, F. (1940), Suppl. J. Roy. Stat. Soc., 7, 65. 

⁸ Whittaker and Watson (1927), Modern Analysis, Cambridge. 

⁹ v. Mises, R. (1935), Rev. Fac. Sci. Istanbul, 1, 61. 

¹⁰ v. Mises, R., Actualités Scientifiques, No. 736 (Hermann, Paris). 

¹¹ Cramér, H., Random Variables and Probability Distributions. Cambridge. 

¹² Borel, E. (1914), Introduction Géométrique à quelques Théories Physiques. Paris. 

¹³ Castelnuovo, G. (1926-1928), Calcolo delle Probabilità. Bologna. 





The Significance of Associations in a Square Point Lattice 
By D. J. Finney 

Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford 

H. Todd (1940) has proposed an ingenious series of tests for examining the randomness of a set 
of points in a square lattice, to be used, for example, in determining whether deaths amongst trees 
planted at the corners of such a lattice occur independently or in “clumps.” He considered a 
lattice of mn points in n rows of m, and found the probability that a random pair of them should 
be contiguous (including diagonal contiguity). In order to test the randomness of an observed 
set of μ points, he compared the expected and observed numbers of these doublets (using each 
point several times if it can form part of several doublets): the expectation is ½μ(μ − 1) multiplied 
by the probability, P, for a single doublet, and Todd suggested that the sampling distribution on 
the null hypothesis of a random selection of points could be well approximated by a binomial 
based on P or, in the common situation that P was small, by the corresponding Poisson distribution. 

Todd then extended his results to triplets and quadruplets, sets of three and four contiguous 
points, again comparing the total numbers of these with their expectations; he stated that “since 
the probabilities in the case of triplets are smaller than in the case of doublets, the distribution 
will be of the Poisson type to a high degree of approximation.” 

The values for the expected numbers of doublets, triplets, and quadruplets are correctly derived ; 
the suggested tests of significance, however, seem liable to attach too low a probability to large 
deviations from expectation, through underestimation of the variance of the true sampling 
distribution. Except when μ is very small by comparison with the total number of lattice points, the 
discrepancy is likely to be more important, rather than less, for triplets and still more so for 
quadruplets. A test based on a binomial or Poisson distribution assumes that the separate 
occurrences of these contiguous sets are independent, whereas, even when the μ points are chosen 
entirely at random, the sets are not independent of one another, since a single “clump” may 
contribute a large number. For example, if the μ points happen to contain a clump of nine, arranged 
in a 3 × 3 square, these alone contribute 20 doublets, 48 triplets, and 85 quadruplets. 
Consequently, though Todd's values for the expectations of the numbers of doublets, triplets, and 
quadruplets are correct, the expectations of the squares of these numbers may be much greater 
than for the binomial distributions. 

As an illustration, Todd's example of 3 points in a lattice of 3 × 3 (m = n = 3, μ = 3) may be 
examined. Direct enumeration of the 84 possible selections of 3 points shows the frequencies of 
0, 1, 2, and 3 doublets to be 8, 28, 32, 16 respectively. Hence the expected number of doublets is 

$$E(d) = (28 + 64 + 48)/84 = \tfrac{5}{3},$$

which agrees with the result calculated from equation (1) below. 

Also 

$$E(d^2) = (28 + 128 + 144)/84 = \tfrac{25}{7},$$

and therefore 

$$E[d - E(d)]^2 = \tfrac{50}{63};$$

the binomial variance is 20/27, so that the true variance exceeds the binomial by a factor of 15/14. In 
general the difference between the two variances for doublets seems likely to be small, at least 
when μ is not very large by comparison with mn; when μ is increased beyond ½mn there must be 
a serious discrepancy, since the true variance will then decrease while the binomial variance 
continues to increase. But for triplets and quadruplets the difference may be of considerable 
practical importance, even for comparatively small values of μ, as will be shown below. 
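
The enumeration for the 3 × 3 lattice is small enough to repeat by machine. The Python sketch below is an illustration only, with function names chosen for the example; it treats two points as contiguous when they differ by at most one step in each direction, diagonal contact included, and tabulates the number of doublets over all 84 selections of 3 points.

```python
from fractions import Fraction
from itertools import combinations


def doublets(points):
    """Count pairs of lattice points that touch, diagonal contact included."""
    return sum(
        1
        for (r1, c1), (r2, c2) in combinations(points, 2)
        if max(abs(r1 - r2), abs(c1 - c2)) == 1
    )


def doublet_distribution(m, n, mu):
    """Frequency of each doublet count over all selections of mu points."""
    lattice = [(r, c) for r in range(n) for c in range(m)]
    counts = {}
    for chosen in combinations(lattice, mu):
        d = doublets(chosen)
        counts[d] = counts.get(d, 0) + 1
    return counts


counts = doublet_distribution(3, 3, 3)
total = sum(counts.values())
mean = Fraction(sum(d * f for d, f in counts.items()), total)
second = Fraction(sum(d * d * f for d, f in counts.items()), total)
variance = second - mean ** 2

p = Fraction(20, 36)  # 20 contiguous pairs among the 36 pairs of lattice points
binomial_variance = 3 * p * (1 - p)
```

The exact fractions make the comparison with the binomial variance immediate, and the same functions can be applied to other small lattices, although the number of selections grows rapidly with the size of the lattice.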





Exact evaluation of the variances would be a very laborious process, involving enumeration 
of all possible types of configuration of μ or fewer contiguous points. In an attempt to form some 
idea of the importance of the departure from the binomial distribution, a sampling experiment 
was undertaken. A lattice of 100 points (m = n = 10) was chosen as a convenient size, the points 
were numbered from 1 to 100, and twenty of them were chosen with the aid of a table of random 
numbers (Kendall and Babington Smith, 1939). These were numbered in order of their being 
obtained, and the doublets, triplets, and quadruplets were then counted for the first 2, 3, 4, . . . 19, 
20 points. The procedure was repeated with new sets of twenty random points, until in all 100 sets 



Here there are no doublets in the first four points, but the addition of 5 gives a doublet with 3 ; 
6 and 7 give no new doublets, 8 makes a doublet with 6, 9 adds no more, 10 adds two doublets 
(with 3 or 5), and so on until 20 adds the thirteenth doublet. A similar count of triplets gives 
none until 10 is included to make a triplet with 3 and 5; a second is added by 13, and 14 gives two 
triplets (with 3 and 10 or 5 and 10), and so on. A third count gives the quadruplets. Thus a 
sample of 100 was obtained with μ = 2, 3, 4, . . . 19, 20 and for doublets, triplets, and quadruplets ; 
the counts for successive values of μ are not, of course, independent, but they nevertheless give 
valid estimates of the means and variances for each μ. 
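
The counting scheme just described, in which one ordered set of twenty points yields a doublet count for each of its first 2, 3, . . . 20 members, can be sketched in Python. This is an illustration under stated assumptions: a pseudo-random generator stands in for the table of random numbers, points are drawn without replacement, and the function names and seed are invented for the example.

```python
import random


def touching(p, q):
    """True when two distinct lattice points are contiguous (diagonals included)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1])) == 1


def cumulative_doublets(m, n, size, rng):
    """Doublet counts for the first 2, 3, ..., size points of one random set."""
    lattice = [(r, c) for r in range(n) for c in range(m)]
    chosen = rng.sample(lattice, size)  # ordered, without replacement
    counts, running = [], 0
    for k in range(1, size):
        # Each new point adds one doublet per earlier point that it touches.
        running += sum(touching(chosen[k], earlier) for earlier in chosen[:k])
        counts.append(running)
    return counts


rng = random.Random(1947)
sets = [cumulative_doublets(10, 10, 20, rng) for _ in range(100)]
mean_d = [sum(col) / len(col) for col in zip(*sets)]
```

Here `mean_d[0]` is the mean number of doublets for 2 points and `mean_d[-1]` that for 20 points, each averaged over the 100 simulated sets.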

Doublets. — Todd’s formula for the probability that a random pair of points form a doublet is 


$$P = \frac{2[4mn - 3(m + n) + 2]}{mn(mn - 1)}$$

= 0.069091 for m = n = 10. 
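
Todd's expression for doublets can be checked against a direct count of contiguous pairs. The Python sketch below is an illustration only, and the function names are assumptions made for the example; exact fractions are used so that the two results can be compared without rounding.

```python
from fractions import Fraction


def todd_doublet_probability(m, n):
    """Closed form for the chance that a random pair of lattice points is contiguous."""
    return Fraction(2 * (4 * m * n - 3 * (m + n) + 2), m * n * (m * n - 1))


def brute_force_probability(m, n):
    """Direct count of contiguous pairs over all pairs of lattice points."""
    points = [(r, c) for r in range(n) for c in range(m)]
    pairs = [(p, q) for i, p in enumerate(points) for q in points[i + 1:]]
    touching = sum(
        1
        for (r1, c1), (r2, c2) in pairs
        if max(abs(r1 - r2), abs(c1 - c2)) == 1
    )
    return Fraction(touching, len(pairs))


p10 = todd_doublet_probability(10, 10)
```

For the 10 × 10 lattice the two functions agree, and converting `p10` to a decimal reproduces the value quoted above.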


The expectation of d, the number of doublets in a random selection of μ points, is 

$$E(d) = {}^{\mu}C_2\, P,$$

and, if the distribution of d were the same as a binomial distribution of ${}^{\mu}C_2$ events with P as the 
probability of “success,” the variance of d would be ${}^{\mu}C_2\, P(1 - P)$. In Table I, these quantities 
are compared with the estimates obtained from the sample, the mean $\bar{d}$ and 

$$s^2 = \frac{S(d - \bar{d})^2}{99}.$$

The standard error of $\bar{d}$ is ± s/10; in general, the agreement between $\bar{d}$ and E(d) may be judged 
satisfactory, though there is a tendency for $\bar{d}$ to be above expectation for the larger values of μ. 





For values of μ greater than 7, the variance is less than that for the corresponding binomial, though 
scarcely sufficiently so to be considered significantly low. This last result was unexpected, and no 
reason can be suggested for the true variance of the distribution of d being below the binomial 
variance; the most reasonable explanation seems to be that up to μ = 20 the distribution of 
d does not differ greatly from a binomial, and that the low variances in the experimental 
sampling are due to chance. 


Table I. Sampling Results for Doublets

     μ      E(d)      d̄      μC₂·P(1 − P)      s²
     2      .069      .09        .064         .083
     3      .207      .28        .193         .305
     4      .415      .46        .386         .493
     5      .691      .78        .643         .880
     6     1.04      1.19        .965        1.28
     7     1.45      1.52       1.351        1.40
     8     1.93      2.02       1.801        1.64
     9     2.49      2.60       2.315        1.70
    10     3.11      3.30       2.894        1.81
    11     3.80      3.91       3.54         2.16
    12     4.56      4.67       4.24         2.51
    13     5.39      5.60       5.02         3.82
    14     6.29      6.57       5.85         4.33
    15     7.25      7.62       6.75         4.90
    16     8.29      8.75       7.72         6.73
    17     9.40      9.94       8.75         7.71
    18    10.57     11.03       9.84         8.53
    19    11.81     12.38      11.00         8.92
    20    13.13     13.86      12.22        10.57


Triplets. For triplets, Todd's formula is

\[ P = \frac{6[20mn - 28(m+n) + 36]}{mn(mn-1)(mn-2)} = 0.009128 \quad \text{for } m = n = 10. \]

The expectation of $t$, the number of triplets, is

\[ E(t) = {}^{\mu}C_3 P, \]

and the binomial variance is ${}^{\mu}C_3 P(1-P)$. The sample estimates are compared with these in
Table II. Though $\bar{t}$ also agrees well with its expectation, the results for the variances are very
different from those for doublets. Even with a small value of $\mu$, the variance is appreciably above
that for the binomial, and when $\mu = 20$ it is nearly five times the binomial value. Seriously
misleading conclusions as to the significance of the association in an observed set of $\mu$ points
might be drawn if the number of triplets were compared with its expectation with the aid of the
binomial distribution or the normal approximation thereto.
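The excess of the sampling variance over the binomial value can be explored by repeating the random sampling experiment. The sketch below is illustrative only. It assumes that two lattice points are "associated" when they are neighbours horizontally, vertically or diagonally, and that a triplet is any three points held together by at least two such links; this definition is not stated in the passage above, though it reproduces the bracketed counts in Todd's doublet and triplet formulae. All function names are hypothetical.

```python
import random
from itertools import combinations
from math import comb
from statistics import mean, variance

def adjacent(p, q):
    """Assumed rule: distinct points touch if they differ by at most 1 in each coordinate."""
    return p != q and abs(p[0] - q[0]) <= 1 and abs(p[1] - q[1]) <= 1

def count_triplets(points):
    """Count 3-subsets of `points` joined by at least two touching pairs."""
    total = 0
    for a, b, c in combinations(points, 3):
        links = adjacent(a, b) + adjacent(b, c) + adjacent(a, c)
        if links >= 2:
            total += 1
    return total

def todd_triplet_probability(m, n):
    """Todd's probability that three random points on an m x n lattice form a triplet."""
    cells = m * n
    return 6 * (20 * m * n - 28 * (m + n) + 36) / (cells * (cells - 1) * (cells - 2))

def simulate(mu, trials=100, m=10, n=10, seed=0):
    """Sample mean and variance of the triplet count over `trials` random sets of mu points."""
    rng = random.Random(seed)
    lattice = [(i, j) for i in range(m) for j in range(n)]
    counts = [count_triplets(rng.sample(lattice, mu)) for _ in range(trials)]
    return mean(counts), variance(counts)

P = todd_triplet_probability(10, 10)
for mu in (5, 10, 15):
    t_bar, s2 = simulate(mu)
    expected = comb(mu, 3) * P
    # columns: mu, E(t), sample mean, binomial variance, sample variance
    print(mu, round(expected, 3), round(t_bar, 3), round(expected * (1 - P), 3), round(s2, 3))
```

Because the samples here are drawn independently for each value of $\mu$, the figures will not match Table II exactly; the point of interest is only whether the sample variance exceeds the binomial variance.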

Quadruplets. The probability for a single quadruplet is

\[ P = \frac{24[110mn - 218(m+n) + 403]}{mn(mn-1)(mn-2)(mn-3)} = 0.0017961 \quad \text{for } m = n = 10. \]

The expectation of $q$, the number of quadruplets in $\mu$ random points, is

\[ E(q) = {}^{\mu}C_4 P, \]



[No. 1 


102 Finney — The Significance of 



Table II. Sampling Results for Triplets

 $\mu$    $E(t)$    $\bar{t}$    ${}^{\mu}C_3P(1-P)$    $s^2$
   3      0.009     0.03      0.009      0.029
   4      0.037     0.06      0.036      0.118
   5      0.091     0.15      0.090      0.311
   6      0.183     0.26      0.181      0.699
   7      0.319     0.30      0.317      0.717
   8      0.511     0.46      0.507      0.776
   9      0.767     0.73      0.760      1.088
  10      1.095     1.17      1.085      1.839
  11      1.51      1.57      1.49       2.41
  12      2.01      2.08      1.99       3.85
  13      2.61      2.80      2.59       6.59
  14      3.32      3.65      3.29       7.46
  15      4.15      4.70      4.12      11.89
  16      5.11      5.88      5.07      17.3
  17      6.21      7.07      6.15      23.5
  18      7.45      8.17      7.38      30.7
  19      8.85     10.00      8.76      41.3
  20     10.41     11.97     10.31      51.4


and the binomial variance is ${}^{\mu}C_4 P(1-P)$; the sample estimates of mean and variance are compared
with these in Table III. There is a tendency for $\bar{q}$ to exceed its expectation, but this is probably
due to the chance occurrence of several very high values. For all values of $\mu$, the variance exceeds
the binomial variance, and clearly the latter bears little relationship to the true variance of the
statistic $q$; when $\mu = 20$, the experimental variance is eighteen times that for the binomial.


Table III. Sampling Results for Quadruplets

 $\mu$    $E(q)$    $\bar{q}$    ${}^{\mu}C_4P(1-P)$    $s^2$
   4      0.002     0.01      0.002       0.010
   5      0.009     0.02      0.009       0.020
   6      0.027     0.05      0.027       0.169
   7      0.063     0.05      0.063       0.169
   8      0.126     0.05      0.126       0.169
   9      0.226     0.14      0.226       0.384
  10      0.377     0.36      0.377       1.041
  11      0.593     0.58      0.592       1.62
  12      0.889     0.93      0.887       3.60
  13      1.284     1.41      1.282       7.58
  14      1.798     2.00      1.795       9.29
  15      2.452     2.99      2.447      19.59
  16      3.27      4.16      3.26       26.6
  17      4.27      5.44      4.27       43.0
  18      5.50      6.65      5.49       72.4
  19      6.96      8.92      6.95      118.6
  20      8.70     11.38      8.69      159.8





The danger of assessing probabilities for these distributions by means of the binomial or Poisson
"approximations" is well illustrated by comparison of observed frequencies in the random sampling
experiment with expectations for the theoretical distributions. The contrast is most marked for
quadruplets, but is also clear for triplets. For example, with $\mu = 15$ the expected number of
quadruplets is 2.45, and the observed mean is 2.99; expected frequencies in samples of 100 from
Poisson distributions with these means, taken roughly from Molina's table (1945), by comparison
with the observed frequencies, are:


                     Frequency
 Number of      Poisson with mean
 quadruplets     2.45      2.99      Observed
      0            9         5          28
      1           21        15          29
      2           26        22.5         8
      3           21        22.5         7
      4           13        17          10
      5            6        10           2
      6            3         5           2
     >6            1         3          14


Of the fourteen instances of more than 6 quadruplets, one has 21, another 20, and five more exceed
10, a result virtually "impossible" for either Poisson distribution. For $\mu = 20$, there are four
sets having more than 50 quadruplets (the highest being 63), and ten others exceeding 20; all of
these would be judged very highly significant deviations from a Poisson distribution of mean 8.7,
yet all have arisen in a random sample of 100 trials.
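The Poisson columns of the comparison can be recomputed directly rather than read from tables. This is a small sketch under the assumption that the expected frequencies are simply 100 times the Poisson probabilities, with the last class pooling all counts above 6; the function name is illustrative.

```python
from math import exp, factorial

def poisson_frequencies(mean, sample_size=100, top=6):
    """Expected frequencies of 0, 1, ..., top and 'more than top' in a Poisson sample."""
    probs = [exp(-mean) * mean ** k / factorial(k) for k in range(top + 1)]
    freqs = [sample_size * p for p in probs]
    freqs.append(sample_size * (1 - sum(probs)))  # pooled upper tail
    return freqs

for m in (2.45, 2.99):
    print(m, [round(f, 1) for f in poisson_frequencies(m)])
```

Against either set of expected frequencies, the observed 14 samples with more than 6 quadruplets stand far outside the Poisson tail.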

Enough has been said to emphasize the danger of using tests of significance for departures
from random association based on the numbers of triplets and quadruplets, at least until fuller
investigation of the sampling distributions of these statistics has been made. Though the
distribution of the number of doublets in a random set of points may also be expected to depart from
the binomial form, the results presented above suggest that the effect may be much less serious;
even on this point, further theoretical or empirical evidence is needed before the test is adopted
uncritically.


References 

Kendall, M. G., and Babington Smith, B. (1939), Tracts for Computers, No. XXIV: Tables of Random
Sampling Numbers. Cambridge University Press.

Molina, E. C. (1945), Poisson's Exponential Binomial Limit. D. Van Nostrand Company, Inc.

Todd, H. (1940), Note on random associations in a square point lattice. J. Roy. Statist. Soc. Suppl., 7,
78-82.





The Oscillatory Properties of the Moving Average 

By J. L. Spencer-Smith 
(Linen Industry Research Association) 


Summary 

The process of eliminating trend from a time series by means of the method of moving
averages is shown to be extremely dangerous if the series is being examined for oscillatory
movements. It is shown that the "trend free" moving average difference series formed by the
difference between the terms of the original series and the moving average is always liable to
be oscillatory whether or not the original series contains oscillatory components. The expression
for the serial correlation coefficients and variance of the moving average difference series is
derived in terms of the serial correlation coefficients and variance of the original series. It is
shown that for typical examples of auto-correlated non-oscillatory time series the method of
moving averages is liable to introduce oscillations of considerable magnitude.


1. Most time series exhibit a long term movement or trend in addition to the short term
variations. The analysis of such series is usually concerned with the short term variations, as
they are expected to be a part of the mechanism of the series, whereas the trend may be due to
the impact of changing conditions. The presence of this trend complicates the analysis of the
short term variations and different methods have been devised to eliminate it, the most usual
being to consider the trend as defined locally by a polynomial, in which case it may be determined
by some form of a moving average. Kendall (1941) discussed the effect of this method of
eliminating trend on the oscillatory movements in time series, and came to the conclusion that
when the moving average extends over a considerable number of terms approximating to a
multiple of the length of any cycles which are suspected to be present, taking the trend to be the
moving average does not distort seriously any genuine effects, nor does it introduce very marked
spurious oscillations, i.e. to an extent greater than would be produced on a random series.

Some recent work of mine threw some doubt on the validity of this conclusion and led to 
the investigation described below. 

Kendall (1941) considered a series which comprises a trend, an oscillatory portion, and a
residual random portion. The trend was assumed to be defined by the moving average, and
therefore he only considered the effect of taking the moving average process on the oscillatory
and random portions. It seems to me that this model of a time series has not a general
application, and that in certain cases eliminating trend by a moving average method may introduce a
pronounced oscillatory movement.

If trend is regarded in this way there is nothing more to be said, but this point of view will 
be shown to result in practically every series being regarded as the resultant of a moving average 
and an oscillatory series, as defined by Yule (1926). 

2. Everyone has his own ideas about what trend means. For my part, I cannot help trying
to visualize the processes of any time series on the analogy of the cross sections of a textile yarn.
I make no apology for introducing a textile aspect to this paper, because I regard textiles as a
very fruitful application of time series. In this type of time series, which is the same as a
population series with a life corresponding to a fibre, variations in the mean density of the fibre ends
produce a more or less smooth series which is very similar to the resultant of a smooth trend
with random variations superimposed. But it is artificial to regard the series as made up in this
way, because both features are produced by the same mechanism, and neither the smooth trend
nor the short erratic fluctuations can be produced without the other.





This type of series can only be analysed as a whole, as I hope to show in a future paper. Of 
course such series do not contain very prolonged steady increases or decreases in the general 
values of the terms, as may happen in economic series, and where such movements occur the use 
of the moving average method may be valid. 

In any time series the moving average will never correspond exactly to the terms of the series,
but will exceed or be less than them with approximately equal frequency. When the series is
autocorrelated, consecutive terms will tend to lie on the same side of the moving average, and
terms further removed from each other will therefore tend to lie on opposite sides. Thus
the moving average may be expected to oscillate about the original series, i.e. the series formed
by the difference between the original terms and the moving average, which I shall term the moving
average difference series, will be oscillatory whether the original series contains an oscillatory
component or not.


3. Let the terms of any time series be represented by

\[ u_1, u_2, u_3, \text{ etc.}, \ldots, u_x, \text{ etc.}, \]

and suppose we take a $(2n+1)$ term moving average with weights

\[ a_{-n}, a_{-n+1}, \text{ etc.}, \ldots, a_0, \ldots, \text{ etc., to } a_n, \qquad \sum_{j=-n}^{n} a_j = 1. \tag{3.1} \]

The moving average at $x$ is

\[ a_{-n}u_{x-n} + a_{-n+1}u_{x-n+1} + \text{etc.} + a_0u_x + a_1u_{x+1} + \text{etc. to } a_nu_{x+n}. \tag{3.2} \]

The value of the moving average difference series at $x$ is therefore

\[ U_x = u_x - \{a_{-n}u_{x-n} + a_{-n+1}u_{x-n+1} + \text{etc. to } a_nu_{x+n}\} \tag{3.3} \]

and at $x + s$ it is

\[ U_{x+s} = u_{x+s} - \{a_{-n}u_{x+s-n} + a_{-n+1}u_{x+s-n+1} + \text{etc. to } a_nu_{x+s+n}\}. \tag{3.4} \]

The terms of this series may be examined for oscillatory movements according to Yule's (1926)
definition by evaluating the serial correlogram. We have

\[ \text{Covar. } U_xU_{x+s} = \frac{1}{N-s}\sum_{x} U_xU_{x+s} = \frac{1}{N-s}\sum_{x}\Big[ u_xu_{x+s} - u_x\{a_{-n}u_{x-n+s} + \text{etc. to } a_nu_{x+n+s}\} - u_{x+s}\{a_{-n}u_{x-n} + \text{etc. to } a_nu_{x+n}\} + \{a_{-n}u_{x-n} + \text{etc. to } a_nu_{x+n}\}\{a_{-n}u_{x-n+s} + \text{etc. to } a_nu_{x+n+s}\}\Big]. \tag{3.5} \]

Provided that the series is long enough for the effect of the loss of the end terms to be negligible
we may write

\[ \frac{1}{N-k}\sum_{x=0}^{N-k} u_xu_{x+k} = \text{Var. } u \cdot r_k, \]




where Var. $u$ is the variance of the original series and $r_k$ is the serial correlation coefficient of the
original series for interval $k$. Collecting terms, the expression becomes

\[ \text{Covar. } U_xU_{x+s} = \text{Var. } u\,\Big[\, r_s - \sum_{j=-n}^{n} a_j\,(r_{s+j} + r_{s-j}) + \sum_{i=-n}^{n}\sum_{j=-n}^{n} a_ia_j\,r_{s+j-i} \Big], \tag{3.6} \]

with $r_{-k} = r_k$. The variance of the moving average difference series, Var. $U$, is obtained by putting $s = 0$ in
Equation (3.6) and the serial correlation coefficient $R_s$ for interval $s$ is of course

\[ R_s = \frac{\text{Covar. } U_xU_{x+s}}{\text{Var. } U}. \tag{3.7} \]
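Equations (3.6) and (3.7) lend themselves to direct computation. The sketch below is an illustration rather than the author's own procedure: it writes the difference series as $U_x = \sum_j c_ju_{x+j}$ with $c_0 = 1 - a_0$ and $c_j = -a_j$ otherwise, so that the covariance is a double sum of $c_ic_jr_{|s+j-i|}$. The function name and argument layout are assumptions.

```python
def difference_series_correlogram(r, weights, max_lag):
    """Relative variance and serial correlations R_0..R_max_lag of the
    moving average difference series.

    r       -- function giving the serial correlation r_k of the original series (k >= 0)
    weights -- the 2n+1 moving average weights a_{-n}, ..., a_n (summing to 1)
    """
    n = (len(weights) - 1) // 2
    # Coefficients of U_x = sum_j c_j * u_{x+j}
    c = {j: -weights[j + n] for j in range(-n, n + 1)}
    c[0] += 1.0

    def covariance(s):
        # Covar. U_x U_{x+s} divided by Var. u
        return sum(c[i] * c[j] * r(abs(s + j - i)) for i in c for j in c)

    relative_variance = covariance(0)
    correlogram = [covariance(s) / relative_variance for s in range(max_lag + 1)]
    return relative_variance, correlogram

# Example: r_s falling linearly to zero at s = 10, with a 9 term equal-weight average.
linear = lambda k: max(0.0, 1.0 - 0.1 * k)
rel_var, R = difference_series_correlogram(linear, [1 / 9] * 9, 12)
print(round(rel_var, 3), [round(v, 3) for v in R])
```

For this linear correlogram and a 9 term average the sketch gives a relative variance of about 0.148 with $R_1 = 0.400$ and $R_4 = -0.300$.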

4. Probably the most common type of moving average is when the weights $a_{-n}$, $a_{-n+1}$, etc.
to $a_n$ are equal and have value $\dfrac{1}{2n+1}$, in which case

\[ \text{Covar. } U_xU_{x+s} = \frac{\text{Var. } u}{(2n+1)^2}\Big[ (2n+1)^2r_s - 2(2n+1)\{r_{s-n} + r_{s-n+1} + \text{etc. to } r_{s+n}\} + \{r_{s-2n} + 2r_{s-2n+1} + \text{etc.} + (2n+1)r_s + 2n\,r_{s+1} + \text{etc.} + r_{s+2n}\}\Big]. \tag{4.1} \]

Yule's (1926) definition of an oscillatory series is one in which the serial correlation coefficient
changes sign, i.e. that Covar. $U_xU_{x+s}$ is negative for some values of $s$. It will be seen that the
expression in Equation (4.1) contains a negative term which may easily exceed the sum of the
positive terms for certain values of $s$, even when the original series does not contain an oscillatory
component.
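As an illustration of how the negative term in Equation (4.1) can dominate, the simplest case of a 3 term average ($n = 1$) may be worked through by hand. This is only a sketch; the correlogram $r_s = 1 - 0.2s$ for $s \le 5$ (zero thereafter) is chosen because it is a non-oscillatory series of the kind tabulated later in the paper.

\[ 9\,\frac{\text{Covar. } U_xU_{x+s}}{\text{Var. } u} = 9r_s - 6(r_{s-1} + r_s + r_{s+1}) + (r_{s-2} + 2r_{s-1} + 3r_s + 2r_{s+1} + r_{s+2}) = r_{s-2} - 4r_{s-1} + 6r_s - 4r_{s+1} + r_{s+2}. \]

With $r_{-k} = r_k$, putting $s = 0$ gives $2(0.6) - 8(0.8) + 6 = 0.8$, and putting $s = 1$ gives $0.8 - 4 + 4.8 - 2.4 + 0.4 = -0.4$, so that

\[ R_1 = \frac{-0.4}{0.8} = -0.5, \]

a negative serial correlation although every $r_s$ of the original series is positive or zero.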

Equations (3.5), (3.6) and (3.7) show that the serial correlation of the moving average
difference series depends only on the serial correlation coefficients of the original series and the
weights of the terms in the moving average; thus in considering whether the moving average
difference series of any series is oscillatory we need only consider the different types of series in
terms of their serial correlograms. For this purpose, it seems to me that there are four
fundamental types of time series: (1) Random series, (2) Homogeneous series, (3) Oscillatory series
and (4) Periodic series, which may be defined as follows:


(1) Random series, for which all the serial correlations are zero.

(2) Homogeneous series, for which (a) all the serial correlation coefficients are positive
or zero, and (b) the change in the serial correlation coefficient for unit increase in the
interval $s$ is always negative or zero.

(3) Oscillatory series, as defined by Yule (1926), for which the serial correlation
coefficient (a) changes sign, being alternately positive and negative, and (b) the series formed
by the successive maxima and minima of the correlogram converges to zero.

(4) Periodic series, a special case of oscillatory series, for which the serial correlations
are undamped harmonic functions of $s$.

This represents a very slight modification of Yule’s (1926) classification. 

For the purpose of considering the serial correlograms only, series met with in practice may
be regarded as the resultant of some or all of these types. Thus, for example, Wolfer's sunspot
numbers (cf. Spencer-Smith (1944)) may be regarded as the resultant of a homogeneous, an
oscillatory and a periodic series.


5. We now consider the moving average difference series of any serially correlated time series
represented in this artificial manner. Suppose the terms of the series to be capable of
representation by

\[ f(x) = f_1(x) + f_2(x) + f_3(x) \tag{5.1} \]

where $f_1(x)$ represents the terms of a homogeneous series,
$f_2(x)$ represents the terms of an oscillatory series,
$f_3(x)$ represents the terms of a periodic series,

any of which may or may not be zero. There is no need to include a random series when the
series has a serially correlated component. The serial correlation coefficient of the series for
interval $s$ is

\[ r_s = \frac{{}_1r_s\,\text{Var. } f_1(x) + {}_2r_s\,\text{Var. } f_2(x) + {}_3r_s\,\text{Var. } f_3(x)}{\text{Var. } f_1(x) + \text{Var. } f_2(x) + \text{Var. } f_3(x)} \tag{5.2} \]

where ${}_1r_s$, ${}_2r_s$ and ${}_3r_s$ are the serial correlation coefficients of $f_1(x)$, $f_2(x)$ and $f_3(x)$ respectively.
The terms $F(x)$ of the moving average difference series of $f(x)$ are given by

\[ F(x) = F_1(x) + F_2(x) + F_3(x) \tag{5.3} \]

where $F_1(x)$, $F_2(x)$ and $F_3(x)$ are the moving average difference series of $f_1(x)$, $f_2(x)$ and $f_3(x)$.
The serial correlation coefficient of $F(x)$ for interval $s$ is

\[ R_s = \frac{{}_1R_s\,\text{Var. } F_1(x) + {}_2R_s\,\text{Var. } F_2(x) + {}_3R_s\,\text{Var. } F_3(x)}{\text{Var. } F_1(x) + \text{Var. } F_2(x) + \text{Var. } F_3(x)} \tag{5.4} \]

where ${}_1R_s$, ${}_2R_s$ and ${}_3R_s$ are the corresponding serial correlation coefficients of $F_1(x)$, $F_2(x)$ and
$F_3(x)$.

From the practical point of view, viz. analysing the time series, the important consideration
is: by how much will $R_s$ differ from

\[ \frac{{}_2r_s\,\text{Var. } f_2(x) + {}_3r_s\,\text{Var. } f_3(x)}{\text{Var. } f_2(x) + \text{Var. } f_3(x)} \]

which represents the serial correlogram of the oscillatory and periodic portions only?


6. To answer this question we examine the four basic types of component series for oscillatory
movements.


(1) Random Series.

Yule (1926) has shown that the moving average difference series of a random series is
oscillatory in the sense defined above, and Kendall (1941) discussed its effect when an oscillatory
series is also present. Provided that the moving average extends over a fair number of terms
the negative correlations will always be small.


(2) Homogeneous Series.

The serial correlograms of homogeneous series vary so widely that it seems practically
impossible to cover every case, but the following cases are worth considering.

When the serial correlogram of the original series is given by

\[ r_s = p^s \tag{6.1} \]

where $0 < p < 1$, it is easy to show that

(a) $R_s$ is negative and $\Delta R_s/\Delta s$ is positive when $s = n$,

(b) $R_s$ is positive and $\Delta R_s/\Delta s$ is negative when $s > 2n$,

so that the moving average difference series is oscillatory. This will obviously hold also when
$r_s$ is the mean of a number of similar terms having different values of $p$.
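The behaviour described in (a) can also be seen empirically. The sketch below is an illustration under stated assumptions, not the author's experiment: it generates a first-order autoregressive series, whose correlogram is $r_s = p^s$, subtracts an equal-weight 9 term moving average, and estimates the serial correlation at interval $s = n = 4$ before and after. The generator, seed and series length are arbitrary choices.

```python
import random

def ar1_series(p, length, seed=0):
    """Series with serial correlations r_s = p**s (first-order autoregression, unit variance)."""
    rng = random.Random(seed)
    scale = (1 - p * p) ** 0.5
    value = rng.gauss(0, 1)
    series = []
    for _ in range(length):
        series.append(value)
        value = p * value + scale * rng.gauss(0, 1)
    return series

def moving_average_difference(u, n):
    """Original term minus the equal-weight (2n+1) term moving average centred on it."""
    width = 2 * n + 1
    return [u[x] - sum(u[x - n:x + n + 1]) / width for x in range(n, len(u) - n)]

def serial_correlation(v, lag):
    """Sample serial correlation coefficient of v for the given interval."""
    centre = sum(v) / len(v)
    d = [t - centre for t in v]
    return sum(d[i] * d[i + lag] for i in range(len(d) - lag)) / sum(t * t for t in d)

u = ar1_series(0.9, 20000, seed=1)
U = moving_average_difference(u, 4)
print("original series, lag 4:  ", round(serial_correlation(u, 4), 3))
print("difference series, lag 4:", round(serial_correlation(U, 4), 3))
```

The original series has no negative serial correlations in expectation, so a clearly negative value for the difference series at interval 4 is introduced by the moving average alone.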





In general, when $r_s$ decreases more rapidly than the positive portion of Covar. $U_xU_{x+s}$
in Equation (3.6) decreases more rapidly than the negative portion as $s$ increases, in the
neighbourhood of $s = n$. Thus it may be concluded that the moving average difference series of a
homogeneous series is usually oscillatory. This result is proved in a different way in Section 8.

The serial correlation coefficients $R_s$ for interval $s$ of the 9 term moving average difference
series are given in Table 1 for the following cases of homogeneous series:

Case 1, in which the serial correlation coefficient is given by
$r_s = (0.9)^s$.

Case 2, in which $r_s = (0.7)^s$.

Case 3, in which $r_s = 1 - (0.1)s$ for $s \le 10$,
and $r_s = 0$ for $s > 10$.

Case 4, for a series with an auto regression equation

\[ u_x = 0.9u_{x-1} - 0.2u_{x-2} + \varepsilon_x. \]

This series is not oscillatory by Kendall's (1943) criterion.

The serial correlation coefficients of the original series are also given in the table together 
with the relative variance of the moving average difference series, which is defined as the ratio 
of the variance of the moving average difference series to that of the original series. 


Table 1. Serial Correlation Coefficients of the 9-term Moving Average
Difference Series of Different Homogeneous Series

Case 1: series with serial correlation coefficients $r_s = (0.9)^s$.
Case 2: series with serial correlation coefficients $r_s = (0.7)^s$.
Case 3: series with serial correlation coefficients $r_s = 1 - 0.1s$ for $s \le 10$, $r_s = 0$ for $s > 10$.
Case 4: series given by the auto regression equation $u_x = 0.9u_{x-1} - 0.2u_{x-2} + \varepsilon_x$.

$r_s$ = serial correlation coefficient of original series; $R_s$ = serial correlation coefficient of
9 term moving average difference series.

           Case 1            Case 2            Case 3            Case 4
  s      r_s     R_s       r_s     R_s       r_s     R_s       r_s     R_s
  0     1.000   1.000     1.000   1.000     1.000   1.000     1.000   1.000
  1     0.900   0.398     0.700   0.316     0.900   0.400     0.750   0.481
  2     0.810  -0.143     0.490  -0.102     0.800  -0.033     0.475  -0.100
  3     0.729  -0.449     0.343  -0.332     0.700  -0.283     0.278  -0.412
  4     0.656  -0.490     0.240  -0.353     0.600  -0.300     0.155  -0.437
  5     0.590  -0.256     0.168  -0.141     0.500  -0.083     0.084  -0.210
  6     0.531  -0.123     0.118  -0.031     0.400   0.100     0.044  -0.048
  7     0.478   0.061     0.083   0.013     0.300   0.125     0.025   0.033
  8     0.430   0.020     0.058   0.022     0.200   0.017     0.013   0.039
  9     0.387   0.018     0.041   0.015     0.100  -0.200     0.007   0.024
 10     0.348   0.016     0.029   0.011     0.000  -0.500     0.004   0.013
 11     0.313   0.014     0.020   0.008     0.000  -0.200     0.002   0.000
 12     0.281   0.013     0.014   0.006     0.000   0.017     0.001  -0.005
 13     0.253   0.012     0.010   0.004     0.000   0.142     0.000  -0.002
 14     Tends to zero     Tends to zero     0.000   0.167     Tends to zero
 15                                         0.000   0.083
 16                                         Tends to zero

 Relative
 variance       0.138             0.437             0.148             0.453





These results show that in each case the moving average difference series is oscillatory and
may have by no means negligible variance. Case 3 corresponds to an experimental 1,000 term
series which I had available, and for which I calculated the 9 term moving average difference
series. The actual serial correlation coefficients of this series correspond very closely with the
calculated values for all intervals, as the following few examples show:

0.340, -0.256, 0.115, -0.463 and 0.131.

The effect of the number of terms used in the moving average is illustrated in Table 2 for a series
in which

$r_s = 1 - (0.2)s$ for $s \le 5$
and $r_s = 0$ for $s > 5$.

Table 2

Serial correlation coefficients of moving average difference series.

                      Moving average
  s       3 term    5 term    7 term    9 term    11 term
  1       -0.500     0.000     0.322     0.515     0.600
  2        0.000    -0.400    -0.036     0.129     0.179
  3        0.000     0.100    -0.250    -0.228    -0.152
  4        0.250     0.000    -0.268    -0.457    -0.420
  5       -0.500    -0.500    -0.535    -0.571    -0.620
  6        0.250     0.000    -0.125    -0.228    -0.328
  7        0.000     0.200     0.107     0.000    -0.110
  8        0.000     0.050     0.178     0.121     0.035
  9                  0.000     0.072     0.143     0.110
 10                            0.018     0.072     0.128
 11                            0.000     0.029     0.069
 12                                      0.007     0.035
 13                                      0.000     0.014
 14                                                0.003
 15                                                0.0005

 Relative
 variance  0.089     0.160     0.237     0.333     0.470


In this case both the oscillatory nature and the relative variance of the moving average difference
series increase as the number of terms in the moving average increases, as indeed follows from
Equations (3.6) and (3.7). The introduction of two minimum values in the correlogram of
the 3 and 5 term moving average difference series should be noted.


(3) Oscillatory Series.

When the original series is oscillatory, a number of the serial correlation coefficients will be
negative; thus the value of Covar. $U_xU_{x+s}$, in Equation (3.7), and therefore of the serial
correlation coefficients of the moving average difference series, will be largely determined by the
corresponding serial correlation coefficients of the original series. But if, as is usual, the
correlogram of an oscillatory series is damped, the sum of the remaining terms in the expression will
not be zero, and neither the correlogram nor the variance of the moving average difference series
will equal those of the original series. This difference will increase with the degree of damping
of the original correlogram, and is illustrated by the calculated values of the serial correlograms
and relative variances of the 9 term moving average difference series for two oscillatory series
with different degrees of damping, given in Table 3.





Table 3. Serial Correlation Coefficients of the 9 Term Moving Average
Difference Series of Oscillatory Series

$r_s$ = serial correlation coefficient of original series; $R_s$ = serial correlation coefficient of
9 term moving average difference series.

              Case 1                Case 2
  s        r_s       R_s         r_s       R_s
  0       1.000     1.000       1.000     1.000
  1       0.627     0.557       0.514     0.460
  2       0.117    -0.092       0.078    -0.218
  3      -0.278    -0.546      -0.150    -0.532
  4      -0.422    -0.615      -0.190    -0.472
  5      -0.346    -0.366      -0.127    -0.118
  6      -0.151     0.002      -0.046     0.087
  7       0.043     0.245       0.011     0.169
  8       0.155     0.262       0.031     0.128
  9       0.165     0.196       0.027     0.063
 10       0.103     0.140       0.014    -0.009
 11       0.019    -0.040
 12      -0.045    -0.113
 13      -0.070    -0.089
 14      -0.057    -0.065

 Relative
 variance           0.885                 0.747


A comparison of these serial correlograms with those given in Table 1, which are derived
from homogeneous series, shows a striking similarity between the two and illustrates the danger of
using the moving average to eliminate trend.


(4) Periodic Series, 

There is little need to mention the case of a strictly periodic series : when the number of terms 
in the moving average is equal to the length of the period, it is obvious that the moving average 
difference series will correspond exactly to the original series. 

The conclusion is that when equal weights are given to the terms of the moving average, the
moving average difference series is always oscillatory, whether the original series contains an
oscillatory component or not. When the original series contains an oscillatory component, the
serial correlogram of the moving average difference series may approximate to that of the
oscillatory portion of the series, but such agreement must be regarded as a coincidence.

When the terms of the moving average have different weights, the problem is more
complicated, but it is treated in another way in section 8. When the weights of the terms of the
moving average rise to a pronounced maximum in the region of the central term it is obvious
from Equation (3.7) that the relative variance of the moving average difference series will be
much smaller than when equal weights are used.

7. Before considering the general case of a moving average having different weights it is
interesting to investigate the moving average difference series of a homogeneous series in rather
more detail. Such a series is usually oscillatory and might be expected to approximate to the
type of series investigated by Kendall (1944), viz. that generated by the auto regression equation

\[ u_{x+2} + au_{x+1} + bu_x = \varepsilon_{x+2} \tag{7.1} \]

where $a$ and $b$ are constants such that $4b > a^2$, and $\varepsilon_x$ is a random term. Kendall (loc. cit.)
showed that the serial correlation coefficient for interval $s$ of such a series is

\[ r_s = \frac{p^s \sin(s\theta + \psi)}{\sin\psi} \tag{7.2} \]

where

\[ p = \sqrt{b}, \qquad \theta = \tan^{-1}\frac{\sqrt{4b - a^2}}{-a} \]

and

\[ \tan\psi = \frac{1 + p^2}{1 - p^2}\tan\theta, \]

all of which depend only on the values of $a$ and $b$.

Yule (1927) and Walker (1931) showed that the values of $a$ and $b$ for an auto regression
equation of the type of (7.1) may be obtained from the difference equations of the serial correlogram, viz.:

\[ r_{j+2} + ar_{j+1} + br_j = 0 \tag{7.3} \]

where $j = -1, 0, 1, 2,$ etc.

Kendall (loc. cit.) followed Yule and used $j = -1$ and 0 to derive his values of $a$ and $b$, as
follows:

\[ a = -\frac{r_1(1 - r_2)}{1 - r_1^2}, \qquad b = \frac{r_1^2 - r_2}{1 - r_1^2} \tag{7.4} \]

where $r_1$ and $r_2$ are the serial correlation coefficients for intervals 1 and 2 respectively. He also
found that the oscillation period calculated from equations (7.2) and (7.4) was liable to be less
than that estimated from the positions of the maxima of the correlogram, a fact which he showed
might be attributed to the presence of a superimposed random series.

In applying equation (7.1) to the moving average difference series of the special homogeneous
series considered in section 5, i.e. cases 1 to 4 in Table 1, the possibility of a residual random
component cannot be ignored, and it seems preferable to calculate $a$ and $b$ from equation (7.3)
using $j = 1$ and 2.

These values of $a$ and $b$ may be used to extrapolate the correlogram back to the value of $R_0$
for the oscillatory portion of the series for $s = 0$, and thus the variance of the random component,
Var. $\varepsilon$, may be calculated as

\[ \text{Var. } \varepsilon = (1 - R_0)\,\text{Var. } u \tag{7.5} \]

where Var. $u$ is the variance of the series.

The values of $a$ and $b$ calculated by Equation (7.3) with $j = 1$ and 2 from $R_1$, $R_2$, $R_3$ and $R_4$
for the four moving average difference series considered in Table 1 are given in Table 4, with the
values calculated by Yule's equation (7.4).
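The two ways of fitting $a$ and $b$ can be compared numerically. The following sketch is illustrative only: it assumes the difference equation $r_{j+2} + ar_{j+1} + br_j = 0$, solves it once with $j = -1, 0$ (using $r_0 = 1$ and $r_{-1} = r_1$) and once with $j = 1, 2$, and takes the oscillation period as $2\pi/\theta$ with $\cos\theta = -a/(2\sqrt{b})$. The function names are hypothetical, and the input values are the Case 1 coefficients $R_1$ to $R_4$ from Table 1.

```python
from math import acos, pi, sqrt

def ab_from_first_two(r1, r2):
    """Solve the difference equation with j = -1 and j = 0."""
    a = -r1 * (1 - r2) / (1 - r1 * r1)
    b = (r1 * r1 - r2) / (1 - r1 * r1)
    return a, b

def ab_from_lags_one_to_four(R1, R2, R3, R4):
    """Solve the difference equation with j = 1 and j = 2 (two linear equations in a, b)."""
    det = R2 * R2 - R1 * R3
    a = (R1 * R4 - R2 * R3) / det
    b = (R3 * R3 - R2 * R4) / det
    return a, b

def oscillation_period(a, b):
    """Period 2*pi/theta, where cos(theta) = -a / (2*sqrt(b)); requires 4b > a**2."""
    return 2 * pi / acos(-a / (2 * sqrt(b)))

R1, R2, R3, R4 = 0.398, -0.143, -0.449, -0.490
for label, (a, b) in (("j = -1, 0", ab_from_first_two(R1, R2)),
                      ("j = 1, 2 ", ab_from_lags_one_to_four(R1, R2, R3, R4))):
    print(label, round(-a, 3), round(b, 3), round(oscillation_period(a, b), 2))
```

Since the inputs are rounded to three decimals, small differences from the tabulated values are to be expected.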


Table 4. Calculated Values of a and b

              Calculated by Equation (7.3)         Calculated by Kendall's Method,
                    with j = 1, 2.                         Equation (7.4).
                                 Oscillation                            Oscillation
             -a         b          period           -a         b          period
 Case 1     1.305     0.659         9.85           0.541     0.358         5.68
 Case 2     1.260     0.635         8.85           0.387     0.215         5.50
 Case 3     1.022     0.625         7.25           0.492     0.230        10.24
 Case 4     1.209     0.606         7.98           0.690     0.431         6.17


The comparison of the serial correlation coefficients of these four moving average difference
series with those calculated by Equation (7.3), from $a$ and $b$ as calculated with $j = 1$ and 2 and
from the coefficients starred in the table, is given in Table 5.





Table 5

Serial correlation coefficients of the moving average difference series of Series 1 to 4 of Table 1
(Actual), compared with the values calculated from Equation (7.3) (Calc.).

 Series 1: $r_s = (0.9)^s$.
 Series 2: $r_s = (0.7)^s$.
 Series 3: $r_s = 1 - 0.1s$ for $s \le 10$, $r_s = 0$ for $s > 10$.
 Series 4: given by the auto regression equation of Case 4.

            Series 1            Series 2            Series 3            Series 4
  s      Actual    Calc.     Actual    Calc.     Actual    Calc.     Actual    Calc.
  0      1.000     1.005     1.000     0.790     1.000     0.708     1.000     1.125
  1      0.398     0.398*    0.316     0.316*    0.400     0.400*    0.481     0.481*
  2     -0.143    -0.143*   -0.102    -0.102*   -0.033    -0.033*   -0.100    -0.100*
  3     -0.449    -0.449*   -0.332    -0.332*   -0.283    -0.283*   -0.412    -0.412*
  4     -0.490    -0.490*   -0.353    -0.353*   -0.300    -0.300*   -0.437    -0.437*
  5     -0.256    -0.343    -0.141    -0.245    -0.083    -0.100    -0.210    -0.270
  6     -0.123    -0.126    -0.031    -0.079     0.100    +0.069    -0.048    -0.068
  7      0.061     0.061     0.013    +0.056     0.125    +0.134     0.033    +0.082
  8      0.020     0.163     0.022    +0.121     0.017    +0.094     0.039    +0.140
  9      0.018     0.182     0.015    +0.116    -0.200    +0.012     0.024    +0.119
 10      0.016     0.131     0.011    +0.069    -0.500    -0.047     0.013    +0.059
 11      0.014               0.008    +0.016    -0.200    -0.057     0.000    -0.001
 12      0.013               0.006               0.017              -0.005    -0.037

* Used to calculate the other coefficients.


In each case there is fair agreement between the two values as far as R^, after which the two
diverge. This divergence could, however, only be detected in practice if a very long series were
available. The extrapolated value of $R_0$ for Case 1 is sensibly unity, so that this moving average
difference series approximates closely to Kendall’s auto regressive scheme. Cases 2 and 3
show similar fits, although here the extrapolated values of $R_0$ are less than unity, whilst in
Case 4 it exceeds unity.

I do not consider that this correspondence between the moving average difference series of
these homogeneous series and Kendall’s auto regressive type of series has any special significance,
but it serves to emphasize the danger of using the moving average when searching for an oscillatory
movement.

The difference between the values of $a$ and $b$ as calculated for Case 1 by the two methods
seems surprising at first sight in view of the fact that the extrapolated value of $r_0$ for the oscillatory
part of the correlogram approximates to unity. The reason is, of course, that the auto regression
equation (7.1) does not hold: the difference equation,

rj ^ a 4- arj (7.3) 

only holds for $j > 0$. This particular moving average difference equation approximates to the
second order process discussed by Bartlett (1946).

8. The conclusion in section 6 can be reached in a different way. Practically any serial
correlogram of the three basic types described in section 4 can be produced by a series whose
terms are formed by a moving sum, with appropriate weights, of a series of random numbers,
thus:

$$u_k = \sum_{j=0}^{m} a_j \varepsilon_{k+j} \qquad (8.1)$$





where $\varepsilon_k$, etc., are the terms of a random series, and $a_0, a_1, \ldots, a_j$, etc., are the appropriate
weights.* Dodd (1939) has shown that the serial correlation coefficient of such a series for
interval $s$ is

$$r_s = \sum_{j=0}^{m-s} a_j a_{j+s} \Big/ \sum_{j=0}^{m} a_j^2 \qquad (8.2)$$


From this it follows that the conditions which limit the values of $a_0, a_1$, etc., for the basic types
of series are:

(1) For a homogeneous series: $a_j$ is positive or zero for all values of $j$ and
does not change sign.

(2) For an oscillatory series: $a_j$ changes sign, $m$ is finite and $\sum_{j=0}^{m} a_j = 0$.

(3) For a periodic series: $m = \infty$, and the series $a_0, a_1, \ldots, a_j$, etc.,
forms an infinite periodic series.


The moving average difference series of a series of the type given in Equation (8.1) may also
be expressed as a moving average, with different weights, of the same random series. But if, as
is essential in all graduation formulae, the sum of the weights in the moving average is unity,
the sum of the weights in the moving sum of the random series representing the moving average
difference series will be zero. Thus by the second condition above the moving average difference
series of any time series is always liable to be oscillatory, no matter what weights are given to
the moving average.
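The argument can be illustrated numerically. The sketch below is not part of the original paper: it assumes a purely random series and a simple five-term moving average with equal weights (both chosen only for illustration), forms the weights of the moving average difference series, and applies the serial correlation formula (8.2) to them. The function names are hypothetical.

```python
def serial_correlation(weights, s):
    """Serial correlation for interval s of a moving sum of random terms,
    following formula (8.2): sum(a_j * a_{j+s}) / sum(a_j ** 2)."""
    numerator = sum(weights[j] * weights[j + s] for j in range(len(weights) - s))
    denominator = sum(a * a for a in weights)
    return numerator / denominator


def difference_weights(ma_weights):
    """Weights of (central term minus moving average) for a random series.
    Assumes an odd number of moving average weights."""
    centre = len(ma_weights) // 2
    weights = [-w for w in ma_weights]
    weights[centre] += 1.0
    return weights


# Illustrative assumption: equal-weight five-term moving average (weights sum to unity).
moving_average = [1 / 5] * 5
diff = difference_weights(moving_average)

# The weights of the difference series sum to zero and change sign,
# which is the second (oscillatory) condition above.
weight_sum = sum(diff)
correlogram = [serial_correlation(diff, s) for s in range(len(diff))]
```

Under these assumptions the first two serial correlations of the difference series are negative, although the original series is purely random.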

From the foregoing remarks it is clear that the method of eliminating trend by means of a 
moving average is an exceedingly dangerous process for investigating oscillatory movements, 
since the method is in itself a sure way of generating an oscillatory series. The method may 
be applicable in certain series in which the terms show a general upward or downward tendency, 
but even in this case the oscillations in the residual series should be viewed with suspicion unless 
there is other evidence about them. 

The question arises: How genuine are the oscillations in moving average difference series
such as Kendall’s (1943) statistics of British agriculture? Are they merely the result of the moving
average process? Certainly they show a striking similarity to the oscillations of the moving
average difference series of a homogeneous series, but this is not evidence that they are not
genuine. The moving average difference provides no evidence either for or against, and until
they are confirmed by other methods they must be treated with suspicion.

In conclusion I should like to express my thanks to Mr. M. G. Kendall for helpful criticism, 
and to the Director and Council of The Linen Industry Research Association for permission to 
publish this paper. 


References.

Kendall, M. G. (1941), “The effect of the elimination of trend on oscillations in time series,” J. Roy. Stat.
Soc., 104, 43.

Yule, G. U. (1926), “Why do we sometimes get nonsense correlations in time series?”, ibid., 89, 1.

Spencer-Smith, J. L. (1944), “On the specification of disturbed periodic series,” ibid., 107, 231.

Dodd, E. L. (1939), “On the length of cycles from the graduation of chance elements,” Ann. Math. Stat.,
10, 254.

Kendall, M. G. (1943), “Oscillatory movements in British agriculture,” J. Roy. Stat. Soc., 106, 91.

Kendall, M. G. (1944), “On auto regressive time series,” Biometrika, 33, 104.

Yule, G. U. (1927), “On a method of investigating periodicities in disturbed series with special reference
to Wolfer’s sunspot numbers,” Phil. Trans., 226a, 267.

Walker, Sir G. (1931), “On periodicity in related terms,” Proc. Roy. Soc., 131a, 518.

Bartlett (1946), “Symposium on Serial Correlation,” Royal Statistical Society, Research Section, January
29, 1946.

* Dodd (1939) considers all series of this type to be oscillatory, but if we follow Yule’s (1926) definition,
this is only true if the second condition given below holds true.

SUPP. VOL. IX. NO. 1. I 



114

Slater — The Factor Analysis of a Matrix of 2 x 2 Tables

[No. 1,


The Factor Analysis of a Matrix of 2 x 2 Tables 
By Patrick Slater 

1. Origin of the Present Paper 

A paper with the title “A method of factor analysis for application to matrices of 2 x 2 
tables” was circulated privately to members of the Advisory Committee of the Directorate of 
Selection of Personnel, War Office, in January, 1946. It described the method of analysis and 
the test of significance given here. Less discussion was given of other alternative methods, more 
attention was paid to working methods, and the example used was an artificial one. 

In this paper I have used experimental data referred to me for analysis by Dr. W. Mayer-Gross,
but it is on the method of analysis and the test of significance, which are the same as in the original
paper, that I would particularly welcome comments. The psychological interpretation of the
results will be presented for discussion elsewhere.

2. The General Form of the Problems to which Factor Analysis is Applicable 

Factor analysis is a statistical procedure often used in attempts to solve psychological problems.
These problems are generally similar in form. They arise because there is no limit to the number
of ways (variables or attributes) in which human variation may be observed and recorded, but
it is commonly thought that the variations observed can be ascribed to underlying psychological
characteristics (factors) the number of which is limited; or, at least, that some of these
characteristics account for human behaviour to a greater extent than others, and that if accurate
assessments of a limited number of them can be obtained, variations of human behaviour in a great
many other respects can be predicted from them.

A hypothesis may therefore be set up, such as, that variation observed and recorded in terms
of variables a, b, c, etc., is determined by factor 1; variation in terms of variables n, o, p, etc.,
by factor 2, and so on. An experiment may be conducted to observe a number of individuals
and record their variation in terms of all the variables. The correlations are obtained between
the individuals' measurements in terms of each variable and every other. Using $i$ and $j$ to indicate
any two variables, the correlations form a matrix which may be written $[r_{ij}]$. Under the
condition $i \neq j$ this is a square symmetrical matrix with no entries in its leading diagonal. Applied
to this, factor analysis derives one or more independent sets of loadings, each of which may be
written as a column matrix $[l_1]$, $[l_2]$, . . ., etc. In each column there is one loading for each
of the variables, and loadings of different variables may vary between $+1$ and $-1$. The
sum of

$$[l_1][l_1]' + [l_2][l_2]' + \ldots, \text{etc.}$$

(where $[l]'$ is $[l]$ written as a row instead of a column) provides a matrix of expected correlation
coefficients $[e'_{ij}]$. The hypothesis is considered confirmed if the variables associated with a
particular factor are found to have significant loadings in the same column, and if, under the
condition $i \neq j$, $[e'_{ij}]$ does not differ significantly from $[r_{ij}]$.
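As a minimal sketch of how columns of loadings reproduce a matrix of expected correlations, the following forms the sum of the products of each column with its own transpose, leaving the leading diagonal empty. The loadings are invented for illustration and the function name is hypothetical; nothing here is taken from the paper's data.

```python
def expected_correlations(loading_columns):
    """Sum of [l][l]' over all columns of loadings, for i != j only.
    The leading diagonal is left as None, as in the observed matrix."""
    m = len(loading_columns[0])
    expected = [[None if i == j else 0.0 for j in range(m)] for i in range(m)]
    for column in loading_columns:
        for i in range(m):
            for j in range(m):
                if i != j:
                    expected[i][j] += column[i] * column[j]
    return expected


# Hypothetical loadings of three variables on two factors.
factor_1 = [0.8, 0.7, 0.6]
factor_2 = [0.3, -0.2, 0.1]
e = expected_correlations([factor_1, factor_2])
```

Each off-diagonal entry is then compared with the corresponding observed correlation when the fit is tested.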

More often the hypothesis is not precisely defined. If a new psychological test or a new
method of observation has been designed, an experiment may be made to find with what factors
the variations it records are associated. Other variables, with factorial associations known
from previous experiments, are included with it, and its factor loadings are found and compared
with theirs. Or no prior hypothesis is advanced: a psychological explanation is sought, post hoc,
for the fact that a group of variables prove on analysis to have significant loadings in the same
column. Factor analysis can also be applied to correlations between persons, and persons with
similar factor loadings are described as belonging to the same type.

Various methods of factor analysis have been described. Useful reviews of alternative methods
are given by Holzinger and Harman,¹ Burt² and Thomson³. Whatever method is used, the
problem must arise of testing how well the hypothetical matrix $[e'_{ij}]$ fits the observed matrix
$[r_{ij}]$. Various solutions have been proposed. Spearman's use of tetrad differences is of historical
interest,⁴ and Kelley's contributions to the subject should be noted (⁵ and ⁶), but Holzinger and
Harman’s recent review shows that no solution has yet been accepted as authoritative.

If two methods of analysis are applied to the same matrix, the one which gives the poorer fit 
after a given number of factors have been extracted will leave the larger residual correlations for 
the extraction of further factors. Paradoxically, therefore, the more inefficient the method used, 
the more factors will be discovered. The test of goodness of fit proposed by Burt (op. cit., p. 339)
makes due allowance for the increasing stringency of the requirements for a good fit as the number 
of factors postulated increases. 

In the analysis described here, the method for computing the terms used in place of factor
loadings is adapted from Thurstone,⁷ and the method for testing goodness of fit from Burt. My
chief aim in the adaptations I have made has been to eliminate unnecessary intermediate processes
in the argument and in the method of computation, and to illustrate the essential steps clearly.
I hope that as a result, attention can be concentrated on the problem of finding a satisfactory
test of goodness of fit, and this problem can be brought nearer to a complete solution.

The essential function of factor analysis is to classify: to test, for instance, whether variables
a, b, c, etc., belong in a single class, whether this is the same class as that which includes variables
n, o, p, etc., or not; whether an indeterminate variable properly belongs to one class or another,
and so on. In this way it may subserve psychological arguments concerning characteristics
supposedly common to different variables; but it may also be applied as a classificatory procedure
when the characteristics postulated are not psychological. It has possible applications outside
psychology.


3. Its Application to a Matrix of 2 x 2 Tables

Table 1 shows a 2 x 2 table with the notation which will be used subsequently.


Table 1.

                                       Attribute i
                          Present            Absent                        Total
Attribute j   Present     $o_{ij}$           $a_j - o_{ij}$                $a_j$
              Absent      $a_i - o_{ij}$     $n - a_i - a_j + o_{ij}$      $n - a_j$
              Total       $a_i$              $n - a_i$                     $n$


This table summarizes the statements that, of $n$ individuals, $a_i$ possess attribute $i$, $a_j$ attribute $j$,
and $o_{ij}$ both attributes, and shows the logical consequences of these statements.
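The four cells of such a table follow mechanically from the three marginal statements. A minimal sketch, using the figures quoted later in the paper for items 1 and 2 (256 officers, 90 and 54 occurrences, 24 concomitant occurrences); the function and key names are hypothetical labels, not the paper's:

```python
def two_by_two(n, a_i, a_j, o_ij):
    """Cells of the 2 x 2 table implied by n individuals, a_i with attribute i,
    a_j with attribute j, and o_ij with both attributes."""
    return {
        "both": o_ij,
        "i_only": a_i - o_ij,
        "j_only": a_j - o_ij,
        "neither": n - a_i - a_j + o_ij,
    }


cells = two_by_two(n=256, a_i=90, a_j=54, o_ij=24)
total = sum(cells.values())  # the four cells always add back to n
```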

Such a table is the simplest and most general form in which the association between two 
attributes can be shown. Tables showing more complex forms of association, such as those 
used for calculating product moment correlation coefficients, can be reduced to 2 x 2 tables, 
though not necessarily without some loss of accuracy. A method of factor analysis suitable 
for matrices of such tables will therefore have the utmost generality, but will not necessarily be 
the most efficient in particular instances. 

The method adopted in the present instance is applied to observations of the occurrence or 
non-occurrence of symptoms and pairs of symptoms which have been found to differentiate 
officers who have suffered some form of neurosis from other officers in the forces. The problem 
to be considered is whether the frequencies with which the pairs of symptoms were observed 
concomitantly afford evidence of any special syndromes among the symptoms, such as might be 
used to distinguish different forms of neurosis.

Table 2 gives a descriptive heading for each item (symptom), and shows how often each was
observed in a composite group of 256 officers, 201 of whom had received treatment for neurosis.
The values of $\chi^2$ given in the final column show to what extent each discriminates the neurotics
from the normals. $a_i$, the number of times item $i$ occurs, $= 90$ when $i = 1$, 54 when $i = 2$,
etc. The total of such numbers, 1,101 in this example, will be indicated as $\sum a$.

Table 3 shows how often each item occurred concomitantly with every other. $o_{ij}$, the number
of times items $i$ and $j$ were observed concomitantly, $= 24$ when $i = 1$ and $j = 2$ (or vice versa),
etc. The matrix $[o_{ij}]$ would be shown in full by Table 3 if its lower left-hand half were filled in





Table 2. — Frequency of Occurrence of Individual Symptoms (items)

                                            (a) Among 201   (b) Among 55   (c) Among 256
Symptom                                     neurotic        normal         all cases        $\chi^2$*
                                            officers        officers

 1. Heredity                                      83              7              90          14.23
 2. Physical ill-health                           54              0              54          17.15
 3. Neurotic traits in childhood                  91              9             100          13.97
 4. Former psychiatric illness                    74              3              77          18.73
 5. Shy, solitary, etc., in childhood             80              6              86          14.89
 6. Difficulty in making social contact           66              4              70          12.95
 7. Emotional instability                        146              5             151          69.48
 8. Obsessional features                          74              7              81          10.50
 9. Apprehensiveness                             118              1             119          53.92
10. Dependence                                   107              2             109          41.44
11. Unstable work record                          47              4              51           6.05
12. Marriage or sexual difficulties               68              3              71          15.96
13. Alcoholism                                    42              0              42          12.27

    Total                                                                     1,101

* Using Yates' correction.


Table 3. — Frequency of Concomitance of Pairs of Symptoms (items)

Item                          1    2    3    4    5    6    7    8    9   10   11   12   13    Total
                                                                                             of row
1                             —   24   48   32   36   29   68   31   56   48   19   30   17      438
2                                  —   29   25   31   19   42   20   38   32   16   17   10      279
3                                       —   48   56   42   78   42   75   60   28   35   18      482
4                                            —   33   28   63   33   50   57   23   33   21      341
5                                                 —   54   67   34   62   57   26   36   13      349
6                                                      —   55   32   52   47   24   29   12      251
7                                                           —   54   91   82   38   59   37      361
8                                                                —   50   43   16   27    9      145
9                                                                     —   72   30   44   24      170
10                                                                         —   29   43   20       92
11                                                                              —   23   15       38
12                                                                                   —   17       17
13                                                                                        —        —

Total of column               —   24   77  105  156  172  373  246  474  498  249  376  213    2,963
Total of row                438  279  482  341  349  251  361  145  170   92   38   17    —    2,963
Total of row plus column    438  303  559  446  505  423  734  391  644  590  287  393  213    5,926

Note. — The sum of the row totals checks with the sum of the column totals.


symmetrically. $\sum o_i$, the total number of times item $i$ was observed in concomitance with any
other item, $= 438$ when $i = 1$, 303 when $i = 2$, etc. $\sum o$, the sum of all the terms in $[o_{ij}]$, $= 5{,}926$.
The application of factor analysis directly to this matrix is described below. Some reasons for
preferring it to any matrix of derived coefficients are given afterwards.

Assuming no association between items $i$ and $j$, their expected frequency of concomitance,
$e_{ij} = a_i a_j / n$. This can also be written as $n p_i p_j$ where $p_i$ is the probability of finding $i$ in a single
case ($= a_i / n$). To test whether $o_{ij}$ differs significantly from $e_{ij}$, we can calculate

$$\chi^2_{ij} = (o_{ij} - e_{ij})^2 \left\{ \frac{1}{e_{ij}} + \frac{1}{a_i - e_{ij}} + \frac{1}{a_j - e_{ij}} + \frac{1}{n - a_i - a_j + e_{ij}} \right\} \qquad (1)$$

with one degree of freedom. Thus all the values of $e_{ij}$ obtainable from the values of $a_i$ and $a_j$
in Table 2 can be compared with the corresponding values of $o_{ij}$ in Table 3, and 78 values of $\chi^2$
will be obtained, each with one degree of freedom. In general, where there are $m$ items, there
are $\tfrac{1}{2}m(m - 1)$ independent values of $o_{ij}$ and the same number of values of $\chi^2$ can be obtained
without duplication. On other assumptions other values of $e_{ij}$ and accordingly of $\chi^2_{ij}$ can be
obtained.
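Formula (1) can be sketched in a few lines. The example below applies it to items 1 and 2 on the assumption of no association, using the frequencies given in Tables 2 and 3; the function name and argument order are assumptions made for illustration.

```python
def chi_square(o_ij, e_ij, a_i, a_j, n):
    """Formula (1): the squared difference between observed and expected
    concomitance, multiplied by the sum of the reciprocals of the four
    expected cell frequencies of the 2 x 2 table."""
    expected_cells = (e_ij, a_i - e_ij, a_j - e_ij, n - a_i - a_j + e_ij)
    return (o_ij - e_ij) ** 2 * sum(1.0 / cell for cell in expected_cells)


# Items 1 and 2: n = 256, a_1 = 90, a_2 = 54, o_12 = 24.
n, a_1, a_2, o_12 = 256, 90, 54, 24
e_12 = a_1 * a_2 / n                      # expected frequency with no association
chi_12 = chi_square(o_12, e_12, a_1, a_2, n)

# Number of independent pairs among m = 13 items: m(m - 1) / 2.
pairs = 13 * 12 // 2
```

The same function serves for any other hypothesis: only the expected frequency passed as `e_ij` changes.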

The sum of such values of $\chi^2$ ($\sum\chi^2$) is used here to test the goodness of fit of various
hypothetical matrices $[e_{ij}]$ to the observed matrix $[o_{ij}]$. On the assumption of no association $\sum\chi^2$ is
allotted $\tfrac{1}{2}m(m - 1)$ degrees of freedom; and as other hypothetical matrices are obtained by drawing
on information contained in the observed matrix $[o_{ij}]$, corresponding reductions are made in
the degrees of freedom allotted to $\sum\chi^2$. Similar procedures have been proposed by Kelley* and
Burt. Burt gives a general indication of the extent to which different hypotheses reduce the
degrees of freedom of $\sum\chi^2$. There may, however, be some valid objection to summing values
of $\chi^2_{ij}$. This is a point on which I would particularly welcome comments.

4. The Extraction of a Single General Factor 

The assumption of no association is so wildly improbable in this instance that there is no
need to compute the value of $\sum\chi^2$ for testing it.

The first assumption which deserves serious consideration is that all the items belong to a
single general class, and that there is no need to differentiate any subclasses. Accordingly let us
suppose that no item has a higher probability of occurring concomitantly with one item than
with a second, and let us use $p'_i$ to indicate the general probability that item $i$ will occur
concomitantly with any other item $j$. Then every value of $o_{ij} - e_{ij}$ (where $e_{ij} = a_i a_j / n$) will serve
as an estimate of $n p'_i p'_j$, and all the observed values of $o_{ij} - e_{ij}$ for a given $i$ can be used to
compute an average value of $p'_i$. After computing $p'_i$ for each item in turn, we can compute
the matrix $[e'_{ij}]$, where $e'_{ij} = a_i a_j / n + n p'_i p'_j$; and if our assumption is correct, we shall find
no significant difference between $[e'_{ij}]$ and $[o_{ij}]$.

For the sake of convenience, the values to compute are those of $p'_i\sqrt{n}$. They can be described
as “factor loadings.” Table 4 shows the worksheet used for computing them in this instance,


Table 4. — Worksheet Used in Computing $[p'_i\sqrt{n}]$ on the Assumption
of a Single General Factor

          1.           2.           3.            4.             5.                  6.                9.
Item   $\sum o_i$   $\sum e_i$   Difference    Guessed        First estimate      First estimate    Third estimate
                                               communality    of $p'_i\sqrt{n}$   of $n(p'_i)^2$    of $p'_i\sqrt{n}$

 1        438        355.4297      82.5703        4.4            2.0748              4.3048            2.0738
 2        303        220.8516      82.1484        4.3            2.0623              4.2532            2.0626
 3        559        391.0156     167.9844       21.0            4.5085             20.3263            4.4925
 4        446        308.0000     138.0000       13.0            3.6023             12.9766            3.6044
 5        505        340.9766     164.0234       19.0            4.3663             19.0642            4.3716
 6        423        281.9141     141.0859       14.0            3.6998             13.6883            3.6939
 7        734        560.3516     173.6484       22.0            4.6674             21.7850            4.6649
 8        391        322.7344      68.2656        2.9            1.6977              2.8823            1.6985
 9        644        456.4766     187.5234       26.0            5.0939             25.9476            5.0965
10        590        422.3750     167.6250       21.0            4.4999             20.2490            4.4817
11        287        209.1797      77.8203        3.8            1.9472              3.7914            1.9483
12        393        285.6641     107.3359        7.5            2.7396              7.5052            2.7417
13        213        173.7422      39.2578        0.9            0.9580              0.9178            0.9591

Total   5,926      4,328.7112   1,597.2888      159.8           41.9177            157.6917           41.8895

Estimate of $\sum p'\sqrt{n}$                                   41.9176             41.8925

Note. — Computations taken to 6 decimal places, reproduced here to 4.


and the final values obtained. The following notes explain the entries in the columns. Numbers
refer to columns.

1. Values of $\sum o_i$ are taken from the bottom row of Table 3.

2. $\sum e_i$ is used to indicate the sum of all values of $a_i a_j / n$ for a given item. It is
computed as $a_i(\sum a - a_i)/n$, from the appropriate entries in Table 2.

3. On our assumption, the difference between $\sum o_i$ and $\sum e_i$, shown in this column, is
equal to $n p'_i(\sum p' - p'_i)$, the sum of all values of $n p'_i p'_j$ for a given item. (In this notation
$\sum p'$ indicates the sum of all values of $p'$, including $p'_i$.)

The total of the column is equal to

$$n\left(\sum p'\right)^2 - n\left\{(p'_1)^2 + (p'_2)^2 + \ldots + (p'_i)^2 + \ldots + (p'_{13})^2\right\}$$

i.e. to the sum of all the terms in the matrix $n[p'_i p'_j]$ under the condition, as before, that
$i \neq j$.

4. A set of arbitrary values are entered in this column, which are guesses at the correct
value of $n(p'_i)^2$. Their customary designation is “guessed communalities.”

If the guesses are correct, the term for item $i$ in this column, plus the corresponding
term in the previous column, is equal to $n p'_i \sum p'$, and the total of this column plus the total
of the previous column is equal to $n(\sum p')^2$. Its square root is $\sum p'\sqrt{n}$.

If the guesses are incorrect, the estimate of $n p'_i \sum p'$ obtained from this column and
the previous one, when divided by the estimate of $\sum p'\sqrt{n}$, will give a closer approximation
to the correct value of $p'_i\sqrt{n}$ than that implied by the guessed communality.

5. This argument is used to compute the terms in this column, which are first
approximations to the correct values of $p'_i\sqrt{n}$. The estimate of $\sum p'\sqrt{n}$ is first calculated, then
its reciprocal, and finally, the products of this reciprocal with the successive estimates
of $n p'_i \sum p'$.

The sum of the column can be checked against the estimate of $\sum p'\sqrt{n}$, with which it
should agree.

6. These are the squares of the terms in the previous column.

7 and 8. The terms in column 7 are obtained from the entries in columns 3 and 6 by
the same procedure as that used to obtain the terms in column 5 from entries in columns
3 and 4. The squares of the terms in column 7 are entered in column 8. The two columns
thus give second estimates of the factor loadings and the communalities.

9. The third estimate of the factor loadings, shown in this column, which have been
obtained by applying the same procedure to the entries in columns 3 and 8, are sufficiently
close to the estimates that would be reached after further repetitions of the same procedure
to be used as final estimates in this instance.
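The procedure of columns 3 to 9 can be sketched as a short loop. The code below is an illustrative reading of the notes above, not the author's working method: it takes the differences of column 3 and the guessed communalities of column 4 as printed in Table 4, and repeats the step "add, divide by the square root of the grand total, square" three times. The function name is hypothetical.

```python
import math


def refine_loadings(differences, communalities, rounds=3):
    """Successive estimates of the loadings p'_i * sqrt(n).
    Each round divides (difference + communality) by the square root of the
    grand total, then squares the result to give the next communalities."""
    estimates = []
    for _ in range(rounds):
        total = math.sqrt(sum(differences) + sum(communalities))
        loadings = [(d + c) / total for d, c in zip(differences, communalities)]
        communalities = [value * value for value in loadings]
        estimates.append(loadings)
    return estimates


# Column 3 (differences) and column 4 (guessed communalities) of Table 4.
column_3 = [82.5703, 82.1484, 167.9844, 138.0000, 164.0234, 141.0859, 173.6484,
            68.2656, 187.5234, 167.6250, 77.8203, 107.3359, 39.2578]
column_4 = [4.4, 4.3, 21.0, 13.0, 19.0, 14.0, 22.0, 2.9, 26.0, 21.0, 3.8, 7.5, 0.9]

first, second, third = refine_loadings(column_3, column_4)
```

Because each round's loadings sum to the estimate of the grand total used to compute them, the check described in note 5 is satisfied automatically.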

The final estimates of $p'_i\sqrt{n}$ form a column matrix which can be treated exactly like a set of
factor loadings. They can be used to calculate values of $e'_{ij}$ and hence of $o_{ij} - e'_{ij}$, the differences
shown in Table 5.


Table 5. — Differences Between the Observed Frequencies (shown in Table 3) and the Frequencies
Expected on the Assumption of a Single General Factor

Item       2       3       4       5       6       7       8       9      10      11      12      13

 1      0.74    3.53   -2.55   -3.30   -3.27    5.24   -1.00    3.59    0.39   -2.97   -0.65    0.25
 2             -1.36    1.32    3.84   -3.38    0.53   -0.59    2.39   -0.24    1.22   -3.63   -0.84
 3                      1.73    2.77   -1.94   -1.94    2.73    5.62   -2.71   -0.67   -5.05   -2.71
 4                             -8.63   -6.37    0.77    2.51   -4.16    8.06    0.64    1.76    4.91
 5                                     14.34   -4.12   -0.64   -0.26    0.79    0.35    0.16   -5.30
 6                                             -3.52    3.58    0.63    0.64    2.86   -0.54   -3.03
 7                                                     -1.70   -2.97   -3.20   -1.17    4.33    7.73
 8                                                              3.69    0.90   -3.45   -0.12   -5.92
 9                                                                     -1.51   -3.64   -2.98   -0.41
10                                                                             -1.45    0.48   -2.18
11                                                                                      3.51    4.76
12                                                                                              2.72


Each $p'_i\sqrt{n}$ is an average value. The method of computation ensures, and Table 5 can be used
to confirm, that the sum of the deviations from each average is nil, i.e. that $\sum o_i - \sum e'_i = 0$
within the limits implied by the number of decimal places to which the computations have been
carried. This also shows that one degree of freedom is used in calculating each $p'_i\sqrt{n}$; that the total
loss is equivalent to eliminating any one row and corresponding column from the matrix $[o_{ij}]$;
and that therefore, when $\chi^2_{ij}$ is calculated by applying formula (1) with $e'_{ij}$ substituted for $e_{ij}$,
the number of degrees of freedom of the resulting $\sum\chi^2$ will be $\tfrac{1}{2}(m - 1)(m - 2)$. This has already
been pointed out by Burt.
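The claim that the deviations for each item sum to nil can be checked for item 1 with the figures already given. This sketch assumes the third-estimate loadings of Table 4, the frequencies of Table 2 and the first row of Table 3; the variable names are hypothetical.

```python
n = 256
# Frequencies a_i from Table 2 and final loadings p'_i * sqrt(n) from Table 4.
a = [90, 54, 100, 77, 86, 70, 151, 81, 119, 109, 51, 71, 42]
loadings = [2.0738, 2.0626, 4.4925, 3.6044, 4.3716, 3.6939, 4.6649,
            1.6985, 5.0965, 4.4817, 1.9483, 2.7417, 0.9591]
# Observed concomitance of item 1 with items 2 to 13 (first row of Table 3).
o_row_1 = [24, 48, 32, 36, 29, 68, 31, 56, 48, 19, 30, 17]


def expected(i, j):
    """e'_ij = a_i a_j / n + (p'_i sqrt n)(p'_j sqrt n), with 0-based indices."""
    return a[i] * a[j] / n + loadings[i] * loadings[j]


deviations = [o - expected(0, j) for j, o in enumerate(o_row_1, start=1)]
deviation_sum = sum(deviations)   # nil, within the rounding of the loadings

m = len(a)
degrees_of_freedom = (m - 1) * (m - 2) // 2
```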





Thurstone's method has been used here because, when applied with successive approximations,
it yields values of $e'_{ij}$ which satisfy this condition. (Usually the analysis is applied to $[r_{ij}]$ and
the condition satisfied would be written analogously as $\sum r_i - \sum e'_i = 0$.) Other methods do
not necessarily do so. But in every case when a general factor is extracted, one loading is obtained
for each item, and therefore the number of degrees of freedom lost is equal to the number lost by
eliminating one row and corresponding column from the matrix analysed.

Table 6 shows the values of $\chi^2_{ij}$ calculated to test how well our assumption fits the observations.
Table 7 illustrates the worksheet used in calculating them. The following notes explain working
methods and checks:


Table 6. — Values of $\chi^2_{ij}$ Used for Testing the Significance of the Differences
Given in Table 5

Item                             1      2      3      4      5      6      7      8      9     10     11     12     13    Total
                                                                                                                         of row
1                                —   0.05   0.89   0.51   0.82   0.88   2.11   0.08   0.91   0.01   0.90   0.34   0.01     7.52
2                                       —   0.18   0.18   1.45   1.19   0.03   0.04   0.59   0.01   0.19   1.38   0.11     5.33
3                                              —   0.24   0.60   0.32   0.33   0.56   2.42   0.54   0.05   2.09   0.85     8.00
4                                                     —   6.12   3.55   0.06   0.52   1.50   5.27   0.04   0.27   2.98    20.32
5                                                            —  17.95   1.78   0.03   0.01   0.05   0.01   0.00   3.33    23.16
6                                                                   —   1.56   1.10   0.04   0.04   0.88   0.03   1.17     4.81
7                                                                          —   0.23   0.71   0.83   0.18   1.90   7.87    11.72
8                                                                                 —   1.01   0.06   1.26   0.00   4.41     6.74
9                                                                                        —   0.17   1.42   0.76   0.02     2.37
10                                                                                              —   0.21   0.02   0.55     0.78
11                                                                                                     —   1.35   3.62     4.96
12                                                                                                            —   0.96     0.96
13                                                                                                                   —        —

Total of column                  —   0.05   1.07   0.93   8.99  23.90   5.88   2.55   7.18   6.97   5.14   8.14  25.88    96.68
Total of row                  7.52   5.33   8.00  20.32  23.16   4.81  11.72   6.74   2.37   0.78   4.96   0.96      —    96.68
Total of row plus column      7.52   5.39   9.07  21.25  32.15  28.71  17.60   9.29   9.55   7.75  10.11   9.10  25.88   193.36

Note. — Computations taken to 4 decimal places, reproduced here to 2.


Table 7. — Worksheet Used in Computing $\chi^2_{ij}$ on the Assumption
of a Single General Factor

Pair of       1.           2.            3.          4.               5.               6.                        7.                 8.                     9.
items     $a_i a_j/n$  $n p'_i p'_j$  $e'_{ij}$  $a_i - e'_{ij}$  $a_j - e'_{ij}$  $n - a_i - a_j + e'_{ij}$  $o_{ij} - e'_{ij}$  $(o_{ij} - e'_{ij})^2$  $\chi^2_{ij}$

1, 2        18.9844      4.2775       23.2619     66.7381          30.7381          135.2619                   0.7381             0.5448                0.0533
1, 3        35.1563      9.3156       44.4719     45.5281          55.5281          110.4719                   3.5281            12.4475                0.8901
1, 4        27.0703      7.4752       34.5455     55.4545          42.4545          123.5455                  -2.5455             6.4796                0.5095
1, 5        30.2344      9.0664       39.3008     50.6992          46.6992          119.3008                  -3.3008            10.8953                0.8168
1, 6        24.6094      7.6603       32.2697     57.7303          37.7303          128.2697                  -3.2697            10.6909                0.8832
1, 7        53.0859      9.6742       62.7601     27.2399          88.2399           77.7601                   5.2399            27.4566                2.1097
1, 8        28.4765      3.5225       31.9990     58.0010          49.0010          116.9990                  -0.9990             0.9980                0.0773
1, 9        41.8359     10.5697       52.4056     37.5944          66.5944           99.4056                   3.5944            12.9197                0.9142
1, 10       38.3203      9.2930       47.6133     42.3867          61.3867          104.6133                   0.3867             0.1495                0.0105
1, 11       17.9297      4.0406       21.9703     68.0297          29.0297          136.9703                  -2.9703             8.8227                0.8996
1, 12       24.9609      5.6861       30.6470     59.3530          40.3530          125.6470                  -0.6470             0.4186                0.3442
1, 13       14.7656      1.9891       16.7547     73.2453          25.2453          140.7547                   0.2453             0.0602                0.0072

Total      355.4296     82.5702      437.9998    642.0002         573.0002        1,418.9998                   0.0002                                   7.5156

Note. — Computations taken to 4 decimal places, as shown here.


1. First note down $\sum a - a_1$. Then calculate $a_1/n$. Multiply this successively by
$a_2$, $a_3$ . . . and finally by $(\sum a - a_1)$. The sum of the previous products should check
with this. When item 2 is considered, begin by subtracting $a_2$ from $\sum a - a_1$, and note
this down.

2. Similarly note down $\sum p'\sqrt{n} - p'_1\sqrt{n}$. Then enter $p'_1\sqrt{n}$ on the machine, calculate
products and check as for column 1. For item 2, use $\sum p'\sqrt{n} - p'_1\sqrt{n} - p'_2\sqrt{n}$, etc.

3. The total of this column is equal to the sum of the two previous totals.

4. The totals of columns 3 and 4, added together, $= (m - 1)a_1$; when item 2 is
considered, $= (m - 2)a_2$, etc.

5. The totals of columns 3 and 5, added together, $= \sum a - a_1$; when item 2 is
considered, $= \sum a - a_1 - a_2$, etc., as used in checking column 1.

6. The totals of columns 3, 4, 5 and 6, added together, $= n(m - 1)$; when item 2 is
considered, $= n(m - 2)$, etc.

7. The total of this column is equal to the row total for item 1 (cf. Table 3) minus the
total of column 3. Use other row totals for other items correspondingly.

An alternative method for computing $\chi^2$, given by Lawley*, can be adapted for testing the
significance of the differences $o_{ij} - e'_{ij}$ and should give approximately the same results, with
possibly less labour.

The fit is unsatisfactory. $\sum\chi^2$ ($= 96.68$) is disturbingly large for 66 degrees of freedom.
Separate sums of $\chi^2_{ij}$ for $i = 4$, 5, 6 and 13 are also disturbingly large for the 11 degrees of
freedom they may each be allowed; and two component values, where $i = 5$, $j = 6$, and where
$i = 7$, $j = 13$, are outside the normal range of $\chi^2$ with one degree of freedom.

In conclusion, the assumption that all the items belong to a single general class and that there
is no need to differentiate any subclasses fails to account for the exceptionally close association
between items 5 and 6, or for the association between items 7 and 13; it also provides an inadequate
account of the associations between item 4 and other items, and gives a generally bad fit. It
should be rejected.


5. The Extraction of a Modified General Factor and Two Group Factors 

Thurstone’s method could be used to continue the analysis. The matrix $[o_{ij} - e'_{ij}]$ would
be written in full by inserting the symmetrical half in Table 5. Rows and corresponding columns
in it would be reflected in sign until the number of positive signs was maximized. From the
resulting matrix a second general factor would be extracted. Further factors could be obtained
by further repetitions of this procedure. Each repetition would ensure that the differences
between the observed and the expected frequencies $= 0$ when summed in a fresh way.
Consequently the degrees of freedom contributed by one row and corresponding column of the matrix
of observations will be sacrificed each time, and after $x$ factors have been extracted $\sum\chi^2$ will be
left with $\tfrac{1}{2}(m - x)(m - x - 1)$ d.f.; cf. Burt (op. cit.).
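The rule for the degrees of freedom is simple enough to tabulate. A minimal sketch for the 13 items of the present example (the function name is a hypothetical label):

```python
def degrees_of_freedom(m, x):
    """Degrees of freedom left to the summed chi-square after x factors
    have been extracted from a matrix of m items: (m - x)(m - x - 1) / 2."""
    return (m - x) * (m - x - 1) // 2


# m = 13 items; x = 0 corresponds to the assumption of no association.
remaining = [degrees_of_freedom(13, x) for x in range(4)]
```

With no factor extracted this gives the 78 independent pairs, and with one general factor the 66 degrees of freedom used above.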

As the terms in the matrix used in extracting the second factor are deviations from the averages 
computed when the first factor is extracted, the loadings obtained are average deviations. Those 
obtained from reflected rows are given negative signs, so generally about half the loadings are 
positive, and half negative. Each further column of loadings possesses similar characteristics. 

Thurstone and his followers have provided methods of "rotating axes," i.e. of finding trigonometric
transformations of the loadings in these columns. The positive and negative signs and
the size of the loadings in different columns (provided there are two or more) can be altered in
an infinite number of ways without changing the expected values in the matrix which is their
end product. These methods demonstrate that any expected matrix can be derived from an
infinite number of different assumptions. They are used, however, to find preferred solutions
in which the number of zero or near-zero loadings is maximized and the number of negative
loadings is minimized. Burt gives reasons for preferring not to rotate axes, and attributes the
columns of half positive, half negative loadings to "bi-polar" factors. After rotation, conditions
of the kind $\sum (o_{ij} - e_{ij}) = 0$ are not necessarily satisfied by the loadings in any single column.
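A minimal sketch of this invariance, using made-up loadings rather than any figures from the paper: two columns are rotated through an arbitrary angle, the matrix of products they generate is unchanged, and the second column no longer sums to zero.

```python
import math


def product_matrix(columns):
    """Sum over loading columns of l l' (the matrix the loadings generate)."""
    m = len(columns[0])
    return [[sum(c[i] * c[j] for c in columns) for j in range(m)] for i in range(m)]


def rotate(col_a, col_b, theta):
    """Orthogonal rotation of two loading columns through the angle theta."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    new_a = [cos_t * a + sin_t * b for a, b in zip(col_a, col_b)]
    new_b = [-sin_t * a + cos_t * b for a, b in zip(col_a, col_b)]
    return new_a, new_b


# Hypothetical loadings: a general column and a "bi-polar" column summing to 0.
general = [2.0, 3.0, 1.0, 4.0]
bipolar = [1.0, -1.0, 0.5, -0.5]

rot_a, rot_b = rotate(general, bipolar, 0.7)
before = product_matrix([general, bipolar])
after = product_matrix([rot_a, rot_b])
max_change = max(abs(before[i][j] - after[i][j]) for i in range(4) for j in range(4))

print(max_change)   # effectively zero: the generated matrix is unchanged
print(sum(rot_b))   # no longer zero after rotation
```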

If factor analysis is used for classificatory purposes, this method is an inconvenient one. The
extraction of one factor is relevant for considering whether the variables or attributes fall into a
single class; two factors for considering whether they fall into two approximately balanced
classes; $x$ for $2^{x-1}$ classes, where $x$ can only be a positive integer. There are thus a large number
of hypothetical systems of classification to which Thurstone's method is inapplicable without
trigonometric transformations. Other methods, e.g. Burt's group-factor method or Holzinger's
modification of Spearman's, can be used to fill this gap, but may leave the same condition
unsatisfied.

In the present instance, the analysis in section 3 shows that the items cannot be treated, without
qualification, as belonging to a single general class. Each seems to have a general tendency to
occur concomitantly with every other, but some items appear specially frequently in conjunction
with other particular items. Such items may be supposed to fall into one or more sub-classes
within the general class. For any such item, $\sum o_i$ may be partitioned into, say, $\sum o_{ig} + \sum o_{is}$,
where $s$ indicates any other item in the same sub-class as $i$, and $g$ any of the remaining items.
Using $p'_i$ to indicate the probability that item $i$ will occur concomitantly with any other $g$ item,
and $p''_i$ to indicate its additional probability of occurring with any other $s$ item, we can compute
an average value of $p'_i\sqrt{n}$ which will satisfy the condition $\sum o_{ig} - np'_i(\sum p' - \sum p'_s) = 0$, where
$\sum p'$ as before indicates the sum of all values of $p'_i$, and $\sum p'_s$ indicates the sum for all items in the
same sub-class as $i$ (including $i$). We can also compute an average value of $p''_i\sqrt{n}$ such that
$\sum o_{is} - np'_i(\sum p'_s - p'_i) - np''_i(\sum p''_s - p''_i) = 0$. The procedure used in the previous section
can be adapted to these computations. It could be adapted to deal with a great variety of more
complicated assumptions, if necessary.

Items 4, 5, 6, 7 and 13 are the ones which, on the evidence of Table 6, appear to need special 
sub-classification. Items 5 and 6 are specially closely associated with one another, but not with 
the remaining three. They may therefore be assigned a subclass by themselves. Item 13 is 
relatively closely associated with items 4 and 7, so these three may be assigned a second sub-class. 

On these assumptions the values of $o_{ij}$ shown in Table 8 must be omitted from the matrix
when the values of $p'_i\sqrt{n}$ are computed.

Table 8

    Pair of items.    $o_{ij}$
    4, 7                63
    4, 13               21
    5, 6                54
    7, 13               37
    Total              175

Table 9 shows the worksheet used and the final values of $p'_i\sqrt{n}$ obtained. Working methods are
similar to those for Table 4, but the following notes may be useful. Numbers refer to columns.


Table 9. Worksheet for Computing $[p'_i\sqrt{n}]$ Omitting Certain Values of $o_{ij}$

  i         1.          2.      3. Obs.        4. First est.    5. First est.   6. Second est.   7. Final est.
                                covariances.   commun. + cov.   loading.        commun. + cov.   loading.
  1        438     355.4297       82.5703          4.3007          2.1074           4.4411          2.1262
  2        303     220.8516       82.1484          4.2545          2.0961           4.3936          2.1147
  3        559     391.0156      167.9844         20.1782          4.5647          20.8365          4.6227
  4        362     249.9492      112.0508         33.2649          3.5253          30.7639          3.4460
  5        451     317.4609      133.5391         35.2619          4.0950          30.7162          3.9601
  6        369     258.3984      110.6016         29.7929          3.4059          25.5473          3.2799
  7        634     490.1602      143.8398         43.0506          4.5338          39.5647          4.4236
  8        391     322.7344       68.2656          2.8850          1.7261           2.9794          1.7405
  9        644     456.4766      187.5234         25.9764          5.1794          26.8262          5.2512
 10        590     422.3750      167.6250         20.0804          4.5536          20.7353          4.6114
 11        287     209.1797       77.8203          3.7962          1.9800           3.9204          1.9972
 12        393     285.6641      107.3359          7.5176          2.7863           7.7635          2.8137
 13        155     136.3359       18.6641          8.8515          0.6675           5.8250          0.5740

 Total   5,576   4,116.0313    1,459.9687        239.2108         41.2211         224.3141         40.9612


1. $\sum o_{ig}$ is obtained from Table 3, omitting the values listed in Table 8.

2. When $i$ is a specially subclassified item, its correction term is $a_i(\sum a - \sum a_s)/n$, where
$\sum a_s$ is the sum of the frequencies of occurrence of all items in the subclass including $i$.
After $(\sum a - \sum a_s)/n$ has been computed, the correction terms for all items in the subclass
can be obtained by multiplying it successively by each appropriate value of $a_i$.

3. When $i$ is a specially subclassified item, the estimate needed for this column is that
of $np'_i\sum p'_s$. $\sum p'_s\sqrt{n}$ is estimated by taking values from the last column of Table 4, and
multiplied successively by the appropriate values of $p'_i\sqrt{n}$ given there. The entries in
columns 3 and 4, added together, thus yield estimates of $np'_i\sum p'$ and $n(\sum p')^2$ as before,
the only difference being that the values of $o_{ij}$ listed in Table 8 have been left out of account.
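The arithmetic linking columns 3 to 6 of the worksheet can be sketched as follows. This is a reading inferred from the printed figures, not a statement of the author's exact routine: each loading is taken as the completed row total (column 3 plus column 4) divided by the square root of the grand total, and the rule used for column 6 is likewise an assumption. The input figures are columns 3 and 4 as read from Table 9.

```python
import math

# Columns 3 and 4 of Table 9 as read from the worksheet, items 1 to 13.
obs_cov = [82.5703, 82.1484, 167.9844, 112.0508, 133.5391, 110.6016, 143.8398,
           68.2656, 187.5234, 167.6250, 77.8203, 107.3359, 18.6641]
commun_plus_cov = [4.3007, 4.2545, 20.1782, 33.2649, 35.2619, 29.7929, 43.0506,
                   2.8850, 25.9764, 20.0804, 3.7962, 7.5176, 8.8515]


def loadings_from_row_totals(cov_sums, diagonal_terms):
    """Divide each completed row total by the square root of the grand total."""
    row_totals = [c + d for c, d in zip(cov_sums, diagonal_terms)]
    root = math.sqrt(sum(row_totals))
    return [t / root for t in row_totals]


first_est = loadings_from_row_totals(obs_cov, commun_plus_cov)  # column 5

# Assumed rule for column 6: an ordinary item contributes the square of its
# loading; a subclassified item contributes loading * (sum over its subclass).
second_item_1 = first_est[0] ** 2
subclass_4_7_13 = [3, 6, 12]  # zero-based positions of items 4, 7 and 13
second_item_13 = first_est[12] * sum(first_est[k] for k in subclass_4_7_13)

print(round(first_est[0], 4), round(sum(first_est), 4))
print(round(second_item_1, 4), round(second_item_13, 4))
```

Repeating the step with column 6 in place of column 4 gives a further estimate; the final column of the worksheet appears to come from continuing in this way until the figures settle.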


Values of $p''_i\sqrt{n}$ have now to be found for items 4, 5, 6, 7 and 13, from equations of the
form

$np''_ip''_j = o_{ij} - a_ia_j/n - np'_ip'_j$.

For items 5 and 6 there is only one such equation,

$np''_5p''_6 = 14.4864$,

so $p''_5$ and $p''_6$ are indeterminate. In the final table of factor loadings (Table 10) $p''_5\sqrt{n}$ has been
put, for convenience, equal to $p''_6\sqrt{n} = 3.8061$.


Table 10. Factor Loadings Obtained on the Assumption of a Modified
General Factor and Two Group Factors

         Values of $p'_i\sqrt{n}$    Values of $p''_i\sqrt{n}$    Values of $p''_i\sqrt{n}$
  i      for the general      for the first group     for the second group
         factor ($g$).          factor ($s_1$).           factor ($s_2$).
  1          2.1262
  2          2.1147
  3          4.6227
  4          3.4460                                         1.2419
  5          3.9601                 3.8061
  6          3.2799                 3.8061
  7          4.4236                                         1.8829
  8          1.7405
  9          5.2512
 10          4.6114
 11          1.9972
 12          2.8137
 13          0.5740                                         5.1449

Computations taken to 6 decimal places, reproduced here to 4.
$n = 256$. $\sqrt{n} = 16$.


For items 4, 7 and 13 there are three equations,

$np''_4p''_7 = 2.3383$
$np''_4p''_{13} = 6.3892$
and $np''_7p''_{13} = 9.6874$

from which the final loadings can be obtained directly, using the algebraic method employed
by Spearman, viz.:

$p''_4\sqrt{n} = \sqrt{\dfrac{2.3383 \times 6.3892}{9.6874}} = 1.2419$

etc. These too are entered in Table 10.
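The three products determine the three loadings exactly. A short sketch of that step follows; the function name is illustrative, and the pairing of each figure with a pair of items is taken from the loadings entered in Table 10.

```python
import math


def triad_loadings(prod_12, prod_13, prod_23):
    """Solve l1*l2 = prod_12, l1*l3 = prod_13, l2*l3 = prod_23 for positive l1, l2, l3."""
    l1 = math.sqrt(prod_12 * prod_13 / prod_23)
    l2 = math.sqrt(prod_12 * prod_23 / prod_13)
    l3 = math.sqrt(prod_13 * prod_23 / prod_12)
    return l1, l2, l3


# Products for the pairs (4, 7), (4, 13) and (7, 13).
load_4, load_7, load_13 = triad_loadings(2.3383, 6.3892, 9.6874)
print(round(load_4, 4), round(load_7, 4), round(load_13, 4))
```

With only one product available, as for items 5 and 6, no such solution exists, which is why equal loadings of $\sqrt{14.4864} = 3.8061$ are adopted there for convenience.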

Table 10 therefore gives the complete set of factor loadings derived from the hypothesis
adopted in this section. The two sets of values of $p''_i\sqrt{n}$ are of equal rank, since one does not
have to be determined before the other, so the same notation is used for both. But they are
listed as separate columns, to show that they are mutually exclusive. Using $[g]$, $[s_1]$ and $[s_2]$ to
indicate the three columns of loadings, the expected frequencies of concomitance can be
calculated as

$[e''_{ij}] = [g][g]' + [s_1][s_1]' + [s_2][s_2]'$

under the condition $i \ne j$. The differences between the observed frequencies and these
expectations are listed in Table 11. Values of $\chi^2$, calculated as before, are shown in Table 12. Table 13


Table 11. Differences Between the Observed Frequencies (Shown in Table 3) and the Frequencies
Expected on the Assumption of a Modified General Factor and Two Group Factors

Item.     2.     3.     4.     5.     6.     7.     8.     9.    10.    11.    12.    13.
  1     0.52   3.02  -2.40  -2.65  -2.58   5.51  -1.18   3.00  -0.12  -3.18  -0.94   1.01
  2           -1.87   1.47   4.48  -2.70   0.79  -0.77   1.79  -0.74   1.02   3.93   0.07
  3                   1.99   4.10  -0.51  -1.43   2.31   4.24  -3.90  -1.15  -5.74   1.06
  4                          6.51  -4.36   0.00   2.64  -3.89   8.32   0.78   1.95   0.00
  5                                 0.00  -1.24  -0.10   1.23   2.12   0.96   1.01  -3.38
  6                                       -0.80   4.14   2.24   2.07   3.50   0.36  -1.37
  7                                              -1.48  -2.42  -2.69   0.92   4.67   0.00
  8                                                      3.21   0.49   3.61  -0.36  -5.29
  9                                                            -2.88  -4.20  -3.78   1.46
 10                                                                   -1.92   0.21  -0.53
 11                                                                           3.24   5.49
 12                                                                                  3.74


Table 12. Values of $\chi^2$ Used for Testing the Significance of the
Differences Given in Table 11

Item.                        1.    2.    3.    4.    5.    6.    7.    8.    9.   10.   11.   12.   13.   Total of row.
  1                           -  0.03  0.65  0.45  0.53  0.55  2.32  0.11  0.64  0.00  1.03  0.07  0.13        6.50
  2                                 -  0.34  0.22  1.97  0.76  0.08  0.06  0.34  0.05  0.13  1.61  0.00        5.56
  3                                       -  0.32  1.30  0.02  0.18  0.40  1.41  1.13  0.14  2.72  0.13        7.75
  4                                             -  3.44  1.65  0.00  0.57  1.30  5.60  0.06  0.33  0.00       12.95
  5                                                   -  0.00  0.15  0.00  0.12  0.35  0.09  0.08  1.39        2.19
  6                                                         -  0.07  1.48  0.46  0.37  1.33  0.01  0.25        3.97
  7                                                               -  0.18  0.47  0.58  0.11  2.19  0.00        3.52
  8                                                                     -  0.76  0.02  1.38  0.01  3.58        5.75
  9                                                                           -  0.62  1.92  1.24  0.25        4.02
 10                                                                                 -  0.38  0.00  0.03        0.41
 11                                                                                       -  1.14  4.99        6.13
 12                                                                                             -  1.87        1.87
 13                                                                                                   -           -

Total of column               -  0.03  0.99  0.99  7.24  2.98  2.80  2.79  5.50  8.73  6.57  9.40 12.61       60.63
Total of row               6.50  5.56  7.75 12.95  2.19  3.97  3.52  5.75  4.02  0.41  6.13  1.87     -       60.63
Total of row plus column   6.50  5.59  8.74 13.94  9.43  6.95  6.31  8.54  9.53  9.14 12.69 11.27 12.61      121.25



Table 13. Degrees of Freedom

 Number of     (a) Originally     (b) Exhausted by     (c) Finally
 item.         available.         hypothesis.          available.
    4               12                   2                  10
    5               11                   2                   9
    6               10                   2                   8
    7                9                   2                   7
   13                8                   2                   6
    1                7                   1                   6
    2                6                   1                   5
    3                5                   1                   4
    8                4                   1                   3
    9                3                   1                   2
   10                2                   1                   1
   11                1                   1                   0
   12                0                   1                   0
 Total              78                  n.a.                61

n.a. = not applicable.



shows the working method used for counting the degrees of freedom for $\sum\chi^2$. In the list of items,
those from which more degrees of freedom have been exhausted take precedence over others.
Column (a) shows the number of degrees of freedom yielded by each row of the matrix of
frequencies when the items are arranged in the order shown. Column (b) shows the number of
sub-totals of $\sum o_i$ used in computing the factor loadings for each item. Column (c) shows the
number of degrees of freedom left, i.e. the differences between corresponding entries in columns
(a) and (b) excluding negative differences.

The amount of $\sum\chi^2$ obtained from Table 12 accords with the amount expected by chance,
so the hypothesis fits the observations as a whole well enough. The sums of $\chi^2$ for each item
are also amounts within the range of chance, allowing 10 degrees of freedom each to those for
items 4, 5, 6, 7 and 13, and 11 each for the remainder. The highest single value of $\chi^2_{ij}$, that for
items 4 and 10, is no greater than might be expected to occur by chance in a table containing
74 values, each of which, considered independently of the others, may be allowed one degree
of freedom.
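For a rough numerical comparison of the two fits, the sketch below converts each $\sum\chi^2$ to an approximate normal deviate by the Wilson-Hilferty cube-root approximation. That approximation is brought in here purely for illustration; it is not the method used in the paper, which relies on tabled values of $\chi^2$. The inputs are the totals quoted in the text: 96.68 with 66 degrees of freedom for the single general factor, and 60.63 with 61 for the modified hypothesis.

```python
import math


def wilson_hilferty_z(chi_sq: float, df: int) -> float:
    """Approximate standard normal deviate for a chi-square value.

    Uses the Wilson-Hilferty result that (chi_sq / df) ** (1/3) is roughly
    normal with mean 1 - 2/(9 df) and variance 2/(9 df).
    """
    k = 2.0 / (9.0 * df)
    return ((chi_sq / df) ** (1.0 / 3.0) - (1.0 - k)) / math.sqrt(k)


z_single_class = wilson_hilferty_z(96.68, 66)   # one general factor only
z_sub_classes = wilson_hilferty_z(60.63, 61)    # general factor plus two group factors

print(round(z_single_class, 2), round(z_sub_classes, 2))
```

A deviate well above 2 for the first hypothesis and one close to 0 for the second agrees with the verdicts reached in the text.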

It is therefore concluded that the hypothesis considered in this section provides a satisfactory 
fit to the observations. All the items, it seems, can be treated as belonging to a single class; 
but two sub-classes need to be differentiated, the first containing items 5 and 6 only, the second, 
items 4, 7 and 13. 


6. Interpretation and Validation of the Results 
The psychological discussion of these results is intended for publication elsewhere, but some 
comment may be desired here to provide a happy ending. 

The psychological resemblance between items 5 and 6 is evident. In the descriptive account
furnished with the records, I find the almost apologetic comment: "Analysing our case histories,
it did seem worth while to draw a distinction between (them) because one quarter of our neurotic
cases appear to have overcome the adolescent shyness when they reached adult life." No
specially close psychological resemblance was suggested between items 4, 7 and 13. But if men
prone to neurosis are divided into those who are liable to phases of depression or other periodic
reductions in their resistance, and those of a permanently inadequate type, we may consider
items 4, 7 and 13 pertinent in differentiating the former from the latter. Thus the classification
provided by the statistical analysis is psychologically reasonable.

Following the method used in the two previous sections, we may consider what is the cumulative 
value of the information as a means of differentiating neurotics from normals, firstly when an equal 
weight is assigned to all the items, and secondly, when items in the same sub-class are weighted 
equally but each sub-class is allowed a separate weight. In the first case, we may analyse the 
total variance of the number of items per person: we find then that 35 per cent, is accounted for 
by the difference between the neurotic group and the control. In the second, we may separate 
the total number of items per person into sub-totals for each of the three sub-classes ; A, items 
5 and 6; B, items 4, 7 and 13 ; and C, all others. Treating each sub-total as a separate variable, 
we may calculate the discriminant functions for the same purpose. The appropriate weights are
then found to be in the proportion

    Items in sub-class A, 0.0585
      "        "       B, 1.4850
      "        "       C, 1.0000

and 45 per cent, of the total variance of the weighted sum per person is attributable to the difference 
between the two samples. Thus the classification reached in Section 4 is a useful one. 


7. Notes 

i. The analysis of derived coefficients (tetrachoric correlations, etc.). When factor analysis
is applied to data collected in the form of matrices of 2 x 2 tables, the observations for which the
preferred hypothesis should account are the observed frequencies of concomitance. Any
method of factor analysis may be applied, as here, directly to these observed frequencies, or to
functions of them, such as four-point correlation coefficients or tetrachorics. But if it is applied
to any derived function, two extra stages of work are likely to be involved; firstly, deriving the
functions for analysis, and secondly, reconstructing the matrix of expected frequencies for
comparison with the observed frequencies. Apart from the risk of error introduced by unnecessary
computations, the expected frequencies are likely to differ from the observed more widely if these
two stages are introduced. For instance, if tetrachoric correlations had been used here, a matrix
of them, say $[r_{ij}]$, would need to be calculated from $[o_{ij}]$. Let $[l_1]$ be the column matrix of the
first factor loadings obtained, as in section 3, from $[r_{ij}]$, corresponding with the entries in the final
column of Table 4. Then the sum of every row and every column in the matrix of differences
$[r_{ij}] - [l_1][l_1]'$ is 0 under the condition $i \ne j$. But the values of $e'_{ij}$ reconstructed from
$[l_1][l_1]'$ will not necessarily be such that the matrix of differences $[o_{ij}] - [e'_{ij}]$ will satisfy the same
conditions. Yet these are the differences which should be considered in testing goodness of fit.
So the hypothesis may be more badly fitted to the observations than it need be. To these general
objections to using any derived function whatever must be added any objection to the use of any
particular derived function.

The product moment correlation between two continuous, approximately normally distributed 
variables can generally be estimated within the limits of its error variance by tetrachoric correla- 
tion, provided that the dichotomies are taken near the means and the number of cases is not small 
(say, not under 100). But it is when the number is as large as several thousand, and when facilities 
for tabulation and computation are limited and the need for information is urgent, that tetrachoric 
correlations prove most useful. In this instance the number of cases may not be so small as to 
preclude the use of tetrachorics, but there are no good reasons for using them. The reasons 
against are strong. The gravest, it seems to me, is that the definition of many of the items is 
dependent on the point at which the dichotomy occurs. Thus the enquiries about "heredity"
(item 1) took only parents and sibs into account. A graduated scale for measuring variations in
hereditary predisposition to neurosis would need to be based on more exhaustive enquiries, and
would thus lead to a re-definition of "heredity." The use of tetrachorics here thus involves us
in speculations on what inferences might have been drawn if the same data had been obtained by 
different methods of observation, when we have good reasons to suppose that different methods 
would have yielded different data. 

ii. Analogies between factor analysis and analysis of variance . — The similarity between a factor 
loading and an average has, I hope, been shown in the above account. Other similarities between 
factor analysis and analysis of variance are mentioned here in the hope that they may suggest 
other approaches to the problem of testing goodness of fit and significance of factor loadings. 
They are even more apparent when continuous variables are considered in place of dichotomous 
attributes. 

Let us therefore suppose that $n$ individuals have been measured in $m$ variables. Let $i$ and
$j$ be any two variables. Let $V_i$ indicate the total variance of the measurements in $i$, $V_j$ that in
$j$ and $W_{ij}$ their total covariance. Factor analysis may be applied to the matrix $[W_{ij}]$ or to
matrices of derived coefficients. Usually it is applied to $[r_{ij}]$, the matrix of
mean covariances when each variable has been weighted by the reciprocal of its standard
deviation.

In $[W_{ij}]$ the terms in the leading diagonal (where $i = j$) are omitted. All methods of factor
analysis can be described as methods for interpolating a set of terms in this leading diagonal which
conform with certain requirements and are described as communalities. The total communality
of $i$ is the sum of the squares of its factor loadings (the factor loadings being independent of one
another). It can be discussed as a proportion of the total variance of $i$. As the number of
factor loadings attributed to $i$ increases, this proportion tends to increase. It might be possible
to consider whether the increase due to the extraction of an additional factor is significant.

Again, let

$\sum w$ = the sum of all the terms in $[W_{ij}]$,
$\sum v$ = the sum of $V_i$ over all values of $i$,
$V_t$ = the total variance of the $n$ totals obtained by summing all the $m$ measurements on
each individual;

then $V_t = \sum v + \sum w$, and the sum of all the terms in all the matrices

$[l_1][l_1]' + [l_2][l_2]' + \ldots$

(not subject to the condition $i \ne j$)

can be discussed as a proportion of $V_t$. The proportion tends to increase as the number of factors
increases, and the significance of such increases might be considered.

If the analysis is applied to $[r_{ij}]$, let $\sum r$ = the sum of all its terms. Then $\sum r + m$ = the mean
square variance of the totals obtained by summing all the $m$ measurements on each individual
after each variable has been weighted by the reciprocal of its standard deviation. Therefore
the factor analysis can still be considered as an analysis of the variance of a definable composite
variable.

However, certain difficulties are encountered when we explore this approach further.
Returning to the analysis of $[W_{ij}]$, let

$\sum l_1$ = the sum of all the terms in the matrix $[l_1][l_1]'$ not subject to the condition $i \ne j$,
$\sum c_1$ = the sum of its communalities, i.e. of the terms in its leading diagonal, where $i = j$,
and let

$\sum l_2$, $\sum c_2$, etc., be defined similarly.

Then the ratio $\sum l_1/V_t$ varies with the method used for computing the first factor loadings. When
Thurstone's method is used, as applied in section 3 and described at the beginning of section 4,
before rotation ^ but Ilf 2 0, etc. The proportions etc., there- 

fore do not seem comparable with '^lylVi. This seems likely to be the case, also, if after one general 
factor the remaining factors extracted are group factors. 

Again, let us suppose that two groups of people, $X$ and $Y$, have been measured in the same $m$
variables, and that two matrices of covariances, $[X_{ij}]$ and $[Y_{ij}]$, have been computed. We may
wish to consider whether the first factor loadings obtained from $[X_{ij}]$ differ from those obtained
from $[Y_{ij}]$. Following the suggested approach, we might put

$[W_{ij}] = [X_{ij}] + [Y_{ij}]$

(so that $V_t$ becomes the total variance of the composite variables within groups) and make a
factor analysis of all three matrices, thus obtaining three sets of loadings, say $[l_w]$, $[l_x]$ and $[l_y]$.
We might expect that if $[l_x]$ differs significantly from $[l_y]$, $\sum l_x + \sum l_y$ will form a noticeably larger
proportion of $V_t$ than $\sum l_w$. But in an unpublished experiment of this kind I found that $\sum l_w$
exceeded $\sum l_x + \sum l_y$.

I conclude that there are pitfalls in this approach to the problem, but I am unable to say
whether it should therefore be abandoned. Burt's discussion of this subject should be noted.

iii. Conditions affecting the definition of factors. In the above experiment the matrix analysed
was $[o_{ij}] - [c_{ij}]$ where $c_{ij} = a_ia_j/n$, a single correction term which does not allow for the fact
that the $n$ cases are drawn from two groups. For instance, $c_{1.2} = 90 \times 54/256$ (data from Table
2). The matrix used is therefore that of the total covariances of the items. Why was the
analysis not applied to the matrix of covariances within groups? Why was $c_{ij}$ not taken as the
sum of two correction terms, one for each group? Why, for instance, was $c_{1.2}$ not taken as
$83 \times 54/201 + 7 \times 0/55$? Let us consider how the adopted treatment affects the definition of
the general factor, and how its definition would be affected by this alternative treatment.
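The two alternative correction terms for items 1 and 2 can be worked out directly from the figures quoted above (90 and 54 occurrences among all 256 cases; 83 and 54 among the 201 cases of one group, 7 and 0 among the 55 of the other). The helper name is illustrative only.

```python
def correction_term(a_i: int, a_j: int, n: int) -> float:
    """Frequency of concomitance expected if items i and j were independent."""
    return a_i * a_j / n


# Single correction term, ignoring the division of the cases into two groups.
c_total = correction_term(90, 54, 256)

# Sum of two correction terms, one for each group.
c_within = correction_term(83, 54, 201) + correction_term(7, 0, 55)

print(round(c_total, 4), round(c_within, 4))
```

The within-groups term is the larger, so the covariance left for the factors to explain is smaller under the alternative treatment, in line with the remark that the general factor loadings would then be smaller.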

Each of the items was selected for inclusion in the matrix because it was more frequently 
observed among the neurotic officers than among the normal. Consequently, when the frequencies 
of concomitance were considered, a general tendency towards positive association could be expected 
to occur. This can be attributed to differences in the men’s degree of propensity to neurosis. 

If the matrix analysed had been that of the covariances within groups, the same general tendency 
might still be expected to appear, because the degree of propensity must vary to some extent among 
different members of the same group, and the items can be expected to be sensitive to such varia- 
tions. But the general tendency would be much less marked ; the general factor loadings would be 
smaller and the grounds for defining the general factor would be less secure. 

When we consider to what an extent the results of this analysis might have been modified by
varying either the items selected for inclusion or the composition of the group under observation,
we cannot help doubting, I feel, whether any factor obtained from any matrix of covariances or
dependent functions of covariances can be defined without reference to both. But this leads us
to doubt whether the general factor discovered under one set of conditions can be identified with
that discovered under another. That is to doubt a proposition which has been used as a premiss
in many psychological arguments.


References 

Holzinger, K. J., and Harman, H. H. (1941), Factor Analysis, a Synthesis of Modern Methods.
University of Chicago Press.

Burt, C. (1940), The Factors of the Mind. University of London Press.

Thomson, G. (1946), The Factorial Analysis of Human Ability. University of London Press.

Spearman, C. (1932), The Abilities of Man. Macmillan & Co.

Kelley, T. L. (1928), Crossroads in the Mind of Man. Stanford University Press.

Kelley, T. L. (1935), Essential Traits of Mental Life. Humphrey Milford, London.

Thurstone, L. L. (1935), The Vectors of Mind. University of Chicago Press.

Lawley, D. N. (1943), "The Application of the Maximum Likelihood Method to Factor Analysis,"
British Journal of Psychology, 33, iii, 172-175.




Factorial Experiments Derivable from Combinatorial Arrangements of Arrays 

By C. Radhakrishna Rao 
(King’s College, Cambridge) 

                                                                        PAGE

1. Introduction                                                          128

2. Arrays of Strength “d”                                                128

3. Construction of Arrays                                                129

4. Confounded Designs for Symmetrical Factorial Experiments              132

5. Designs for Asymmetrical Factorial Experiments                        132

6. General Symmetrical Factorial Experiments                             134

   (a) Contrasts Defining Main Effects and Interactions                  134

   (b) A Class of Subsets from which Desired Interactions are Measurable 135

   (c) Designs for Multifactorial Experiments                            135

   (d) Multifactorial Experiments with Groups of Assemblies              137


1. Introduction, 

In a paper (Rao, 1946) the author defined certain configurations of arrays called hypercubes
of strength “d” and applied them in the construction of confounded designs for factorial
experiments. These hypercubes can be constructed with the help of configurations of points
and planes in finite Projective and Euclidean geometries. It has been shown that

(i) a system of confounded designs involving the maximum number of factors and
preserving main effects and interactions up to the order (d - 1) can be constructed in the
case of a symmetrical factorial experiment when a hypercube of strength “d” exists; and

(ii) hypercubes of strength 2 supply confounded designs for asymmetrical factorial
experiments defined by Nair and Rao (1941, 1942a, 1942b).

In this paper the general configuration of arrays of strength “d”, which supply basic
combinatorial arrangements leading to designs for factorial experiments involving simple analysis of
results, is defined and some methods of construction are discussed. Some of the problems
considered are the constructions of

(a) multifactorial designs similar to those of Plackett and Burman (1946), but leading
to the estimation of main effects and interactions up to the order k when interactions of
order equal to and greater than d (> k) are absent,

(b) block designs for symmetrical factorial experiments involving only a subset of
treatment combinations and preserving main effects and interactions up to a given order
when higher order interactions are absent, and

(c) a new series of asymmetrical factorial designs derivable from arrays of strength 2.

This method leads to possible arrangements of multifactorial designs of the type introduced 
by Plackett and Burman (1946) when the number of levels need not necessarily be a prime or a 
prime power. For instance, in the case of factors each at 6 levels a multifactorial design exists 
with 36 assemblies involving the maximum of 3 factors, and in the case of factors each at 12 levels 
a multifactorial design with 144 assemblies can include at least 5 factors. 

The existence of block designs leads to arrangements of fractional replication in the case of
symmetrical factorial experiments. They give rise to multifactorial designs arranged in blocks
so that a source of variation affecting groups of treatments can be eliminated. The general 
theory of fractional replication is treated in a separate communication where the nature of 
arrangements introduced here is studied from the point of view of algebraic groups. 


2. Arrays of Strength d 

Let there be $n$ factors $A_1, A_2, \ldots, A_n$, each of which can assume $s$ values, those corresponding
to the $i$-th factor being represented by $i_1, i_2, \ldots, i_s$. An ordered set

$1_a, 2_b, \ldots, n_k$

may be called a combination or an assembly. This assembly can, without any ambiguity, be
represented by $(a\ b \ldots k)$. There are altogether $s^n$ assemblies, of which a subset of $N$ assemblies
may be called an array and represented by $(N, n, s)$. This array is said to be of strength $d$ if all
$s^d$ assemblies corresponding to any $d$ factors chosen out of $n$ occur an equal number of times.
An array of strength $d$ is represented by $(N, n, s, d)$. This array, when $N$ is a power of $s$, has
been termed a hypercube of strength $d$ in (Rao, 1946).

In a later section it has been shown that the following relationships hold among the
parameters defining an array of strength $d$:

$N = \lambda s^d$, where $\lambda$ is an integer,

$N - 1 \ge {}^nC_1(s - 1) + {}^nC_2(s - 1)^2 + \ldots + {}^nC_{d/2}(s - 1)^{d/2}$ when $d$ is even,

and

$N - 1 \ge {}^nC_1(s - 1) + \ldots + {}^nC_{(d-1)/2}(s - 1)^{(d-1)/2} + {}^{n-1}C_{(d-1)/2}(s - 1)^{(d+1)/2}$ when $d$ is odd.

Some important problems are the constructions of these arrays with the optimum values of $n$ for
a given $d$ and $N$.
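The two inequalities can be turned into a quick calculation of the largest number of factors they permit. The sketch below assumes they read $N \ge \sum_{i=0}^{u} {}^nC_i (s-1)^i$ for $d = 2u$, with the extra term ${}^{n-1}C_u (s-1)^{u+1}$ added when $d = 2u + 1$; the function names are illustrative. The limit is a necessary condition only and need not be attainable.

```python
from math import comb


def least_assemblies(n: int, s: int, d: int) -> int:
    """Smallest N allowed by the inequalities for n factors, s levels, strength d."""
    u = d // 2
    bound = sum(comb(n, i) * (s - 1) ** i for i in range(u + 1))
    if d % 2 == 1:
        bound += comb(n - 1, u) * (s - 1) ** (u + 1)
    return bound


def factor_limit(N: int, s: int, d: int) -> int:
    """Largest n consistent with the inequalities (assumes n = d itself is allowed)."""
    n = d
    while least_assemblies(n + 1, s, d) <= N:
        n += 1
    return n


print(factor_limit(8, 2, 2))    # two levels, 8 assemblies, strength 2
print(factor_limit(8, 2, 3))    # two levels, 8 assemblies, strength 3
print(factor_limit(16, 4, 2))   # four levels, 16 assemblies, strength 2
print(factor_limit(36, 6, 2))   # six levels, 36 assemblies, strength 2
```

For six levels and 36 assemblies the inequality would permit 7 factors, whereas only 3 can actually be accommodated, since no Graeco-Latin square of order 6 exists.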

When $d = 2$ and $N = s^2$ the array co-exists with $(n - 2)$ mutually orthogonal latin squares,
the levels of the last two factors corresponding to the rows and columns and those of the $i$-th
factor to the letters of the $i$-th orthogonal square.

When d 1 and N v/c get arrangements corresponding to orthogonal hypercubes 
(Fisher (1946), Kishen (1942)). The representation of these hypercubes as a set of assemblies is 
convenient as no geometrical configurations such as cells of a cube, etc., are used. 


3. Construction of Arrays 

A general method of constructing these arrays of strength $d$ when $N$ is a power of $s$, where $s$ is a prime
or a prime power, has been discussed in (Rao, 1946). Some special methods of construction are
discussed below and practically useful arrangements are listed.


Arrays of the Form $(s^2, n, s, 2)$.

These arrays are derivable from $(n - 2)$ mutually orthogonal squares of order $s$, so that the
maximum value of $n$ is two more than the number of mutually orthogonal squares of order $s$.

Superimpose all the $(n - 2)$ orthogonal squares so that each cell of this composite square
contains $(n - 2)$ ordered values. The elements of the $i$-th square are made to correspond to
the levels of the $i$-th factor. Border this square with elements $1, 2, \ldots, s$ for the rows and
$1, 2, \ldots, s$ for the columns. These elements can be made to correspond to the levels of the
$(n - 1)$-th and $n$-th factors. Each cell together with the bordered elements gives an ordered
set of $n$ values and the sets arising out of $s^2$ cells give the array $(s^2, n, s, 2)$. It follows that when
$s = 6$, the maximum value of $n$ is 3, as there exists no Graeco-Latin square, and when $s = 12$,
$n$ is not less than 5, as it is known that there exist at least 3 orthogonal squares of size 12.

Given an array $(s^2, n, s, 2)$ we can get arrays for factors less than $n$ by omitting the levels of
some factors. The method is illustrated below when $s = 4$. The 3 orthogonal squares with
bordered elements are



            1      2      3      4
     1     111    222    333    444
     2     234    143    412    321
     3     342    431    124    213
     4     423    314    241    132



The arrangement (16, 5, 4, 2) is

     11111    23412    34213    42314
     22221    14322    43123    31424
     33331    41232    12433    24134
     44441    32142    21343    13244

By omitting the levels of any one and two factors we get arrangements (16, 4, 4, 2) and
(16, 3, 4, 2) respectively.


Arrays of the Form $(2^r, n, 2, d)$.

Let us represent the two levels of a factor by $+$ and $-$ and write down all the $2^r$ combinations
of $r$ factors $F_1, F_2, \ldots, F_r$ each at 2 levels. These represent $r$ columns of the array. A column
obtained by multiplying the signs occurring in a row from the $i$-th, $j$-th, . . . columns is
represented by $F_i F_j \ldots$. The $2^r - 1$ columns generated by

$F_i$ $(i = 1, 2, \ldots, r)$; $F_i F_j$ $(i, j = 1, 2, \ldots, r;\ i \neq j)$; . . . ; $F_1 F_2 \ldots F_r$

satisfy the properties of an array of strength 2. This is the maximum possible number of
factors. Hence when $d = 2$ the highest value of $n$ for which the array can be constructed
is $2^r - 1$.

To get arrays of strength 3, we have to include besides $F_1, F_2, \ldots, F_r$ combinations involving
only an odd number of selections. This gives the number of factors as $2^{r-1}$, which is the maximum
possible as shown by the latter inequality above. Thus when $d = 3$, the maximum $n$ is $2^{r-1}$.
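
The two constructions just described can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the levels $+$ and $-$ are coded as $+1$ and $-1$, and a product column is formed by multiplying the basic columns it selects.

```python
from itertools import combinations, product
from collections import Counter
from math import prod

def sign_columns(r, odd_only=False):
    """Assemblies whose columns are the products F_i, F_i F_j, ... of r basic columns."""
    rows = list(product([1, -1], repeat=r))
    subsets = [c for k in range(1, r + 1) for c in combinations(range(r), k)
               if not odd_only or k % 2 == 1]
    # One assembly per basic combination, one entry per retained product column.
    return [tuple(prod(row[i] for i in sub) for sub in subsets) for row in rows]

def has_strength(assemblies, d):
    """True if every d columns contain each of the 2^d sign combinations equally often."""
    n = len(assemblies[0])
    for cols in combinations(range(n), d):
        counts = Counter(tuple(a[c] for c in cols) for a in assemblies)
        if len(counts) != 2 ** d or len(set(counts.values())) != 1:
            return False
    return True

all_products = sign_columns(3)                 # 2^3 - 1 = 7 columns
odd_products = sign_columns(4, odd_only=True)  # 2^(4-1) = 8 columns
print(len(all_products[0]), has_strength(all_products, 2))
print(len(odd_products[0]), has_strength(odd_products, 3))
```

With all products retained the array has strength 2 but not 3, since $F_1$, $F_2$ and $F_1 F_2$ multiply to a constant column; retaining only odd selections removes such triples.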

To get arrays of strength 4, we have to include besides $F_1, F_2, \ldots, F_r$ the columns listed in
Table 1 below in special cases.


Table 1. Arrays of Strength 4

$N$        Columns to be retained besides $F_1, \ldots, F_r$

$2^4$      $F_1 F_2 F_3 F_4$
$2^5$      $F_1 F_2 F_3 F_4 F_5$
$2^6$      F, Fa Fa F4, Fa F3 F, F«
$2^7$      F, Fa Fa F4, F, Fa F, F«, Fa Fa F, F„ F, Fa F, F, F, F, F,


As examples we may construct arrays of the form $(2^3, n, 2, d)$. Thus, for example, in Table 2,
following, is given the array $(2^3, 7, 2, 2)$.


Arrays of the Form $(s^r, n, s, d)$.

When $s = 3$ or 5 the levels of a factor can be represented by the residue classes mod $s$, and
when $s = 4$, the levels of a factor are represented by 4 elements, $0$, $1$, $\alpha$ and $\alpha^2$, satisfying the
relationships $1 = \alpha + \alpha^2$, $1 + \alpha = \alpha^2$, $1 + \alpha^2 = \alpha$. As before we take $r$ factors $F_1, F_2, \ldots, F_r$ and
write down the $s^r$ combinations using these values. A column obtained as a linear combination of
the elements in the rows from the first $r$ columns is represented by $\lambda_1 F_1 + \ldots + \lambda_r F_r$, where
$\lambda_1, \ldots, \lambda_r$ are the compounding coefficients. The column obtained from $\lambda_1 F_1 + \ldots + \lambda_r F_r$
is the same as that obtained from $\sigma\lambda_1 F_1 + \ldots + \sigma\lambda_r F_r$ except for a permutation of the
elements. There are $(s^r - 1)/(s - 1)$ linear functions which give rise to $(s^r - 1)/(s - 1)$ factors



1947] 


Combinatorial Arrangements of Arrays 


131 


Table 2. The Array $(2^3, 7, 2, 2)$

 $F_1$    $F_2$    $F_3$    $F_1 F_2$    $F_1 F_3$    $F_2 F_3$    $F_1 F_2 F_3$

  +        +        +         +            +            +             +
  +        +        -         +            -            -             -
  +        -        +         -            +            -             -
  +        -        -         -            -            +             +
  -        +        +         -            -            +             -
  -        +        -         -            +            -             +
  -        -        +         +            -            -             +
  -        -        -         +            +            +             -


The array $(2^3, 4, 2, 3)$ is obtained by omitting the columns $F_1 F_2$, $F_1 F_3$ and $F_2 F_3$.


Table 3. — Hypercubes of Strength 3 

$N$        Columns to be retained besides $F_1, \ldots, F_r$

$3^3$      $F_1 + F_2 + F_3$

■ 3 * ‘ F, + F, + F„F, + F, + F„F^ + F, + F„F, + F, + F, 

$4^3$      $F_1 + F_2 + F_3$, $F_1 + \alpha F_2 + \alpha^2 F_3$, $F_1 + \alpha^2 F_2 + \alpha F_3$

$5^3$      $F_1 + F_2 + F_3$, $F_1 + 2F_2 + 3F_3$, $F_1 + 3F_2 + 4F_3$

The only useful hypercube of strength 4 is when $N = 3^4$ and $s = 3$, in which case the column
to be retained besides $F_1, F_2, F_3, F_4$ is $F_1 + F_2 + F_3 + F_4$. As an example the array $(3^3, 4, 3, 3)$
is given below in Table 4.





Table 4. The Array $(3^3, 4, 3, 3)$

 $F_1$  $F_2$  $F_3$  $\Sigma F$        $F_1$  $F_2$  $F_3$  $\Sigma F$        $F_1$  $F_2$  $F_3$  $\Sigma F$

  0      0      0      0             1      0      0      1             2      0      0      2
  0      0      1      1             1      0      1      2             2      0      1      0
  0      0      2      2             1      0      2      0             2      0      2      1
  0      1      0      1             1      1      0      2             2      1      0      0
  0      1      1      2             1      1      1      0             2      1      1      1
  0      1      2      0             1      1      2      1             2      1      2      2
  0      2      0      2             1      2      0      0             2      2      0      1
  0      2      1      0             1      2      1      1             2      2      1      2
  0      2      2      1             1      2      2      2             2      2      2      0




for an array of strength 2. To get arrays of strength 3 we have to retain the columns shown in
Table 3 besides $F_1, F_2, \ldots, F_r$ in these special cases.
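
As a check on the linear-combination construction, the following sketch rebuilds the array $(3^3, 4, 3, 3)$ by appending the column $F_1 + F_2 + F_3$ (mod 3) to the $3^3$ basic combinations. It is an illustration under the assumption that $s$ is prime, so that ordinary arithmetic modulo $s$ can be used; `linear_array` and `has_strength` are hypothetical names.

```python
from itertools import combinations, product
from collections import Counter

def linear_array(s, r, extra):
    """The s^r basic combinations, each followed by the given linear combinations mod s."""
    rows = []
    for base in product(range(s), repeat=r):
        # Each coefficient vector in `extra` gives one additional column.
        added = [sum(c * x for c, x in zip(coef, base)) % s for coef in extra]
        rows.append(base + tuple(added))
    return rows

def has_strength(assemblies, s, d):
    """True if every d columns contain each of the s^d level combinations equally often."""
    n = len(assemblies[0])
    for cols in combinations(range(n), d):
        counts = Counter(tuple(a[c] for c in cols) for a in assemblies)
        if len(counts) != s ** d or len(set(counts.values())) != 1:
            return False
    return True

arr = linear_array(3, 3, [(1, 1, 1)])   # columns F1, F2, F3, F1 + F2 + F3
print(len(arr), has_strength(arr, 3, 3))
```

The case $s = 4$ is not covered by this sketch, because the elements $0, 1, \alpha, \alpha^2$ do not add like integers modulo 4.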


4. Confounded Designs for Symmetrical Factorial Experiments 

If all the blocks in a confounded design for a symmetrical factorial experiment are arrays of
strength $d$, then all main effects and interactions up to the order $(d - 1)$ are preserved. These
designs can be derived from the arrays constructed in the previous section in a simple manner,
and they include the maximum possible number of factors. Designs with a smaller number of
factors can be derived from them by omitting the levels corresponding to some factors.

Starting with the array $(s^m, n, s, d)$ constructed in the previous section, we can construct
another array of strength $d$ by the following procedure: if $(a'\, b' \ldots)$ is an assembly not
occurring in this array, then the array constructed out of the sets $(a + a',\ b + b', \ldots)$, where
$(a\, b \ldots)$ are sets of the given array, is of strength $d$. Starting with an assembly which has
not occurred in the first two we can derive a third array and so on. There are $s^{n-m}$ different
arrays that can be constructed in this way and they cover all the $s^n$ combinations. The law of
addition when $s = 2$, in which case the levels have been represented by $+$ and $-$, corresponds
to multiplication, i.e. is:

$(+) + (+) = (+)$; $(-) + (-) = (+)$; $(+) + (-) = (-)$.

This gives us a confounded design for an $s^n$ symmetrical factorial experiment in blocks of $s^m$ plots
preserving main effects and interactions up to the order $(d - 1)$. A list of useful designs is given
in Table 5. Designs are not written out in full, as they can be easily derived from the method
discussed above. The method of analysis is fully discussed in Rao (1946).
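
The translation procedure can be sketched as follows. This is an illustration only, assuming prime $s$ with levels coded $0, \ldots, s - 1$ and a starting array that contains the assembly $(0\, 0 \ldots 0)$ and is closed under addition, as is the case for the arrays built from linear combinations; `blocks_by_translation` is a hypothetical name.

```python
from itertools import product

def blocks_by_translation(base, s):
    """Split all s^n assemblies into blocks obtained by adding an unused assembly to `base`."""
    n = len(base[0])
    covered, blocks = set(), []
    for shift in product(range(s), repeat=n):
        if shift in covered:
            continue  # this assembly already lies in an earlier block
        block = [tuple((a + b) % s for a, b in zip(row, shift)) for row in base]
        blocks.append(block)
        covered.update(block)
    return blocks

# Starting array (3^3, 4, 3, 3): fourth column is F1 + F2 + F3 mod 3.
base = [(a, b, c, (a + b + c) % 3) for a, b, c in product(range(3), repeat=3)]
blocks = blocks_by_translation(base, 3)
print(len(blocks), [len(b) for b in blocks])
```

Here $n = 4$ and $m = 3$, so the procedure yields $3^{4-3} = 3$ blocks of 27 plots covering all 81 combinations.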


Table 5. Confounded Designs for Symmetrical Factorial Experiments

Levels of a factor.           Block size.     $d$.     Maximum number of factors possible.

$s$ (prime or prime power)    $s^m$           2        $(s^m - 1)/(s - 1)$
2                             $2^m$           3        $2^{m-1}$
                              $2^4$           4        5
                              $2^5$           4        6
                              $2^6$           4        8
3                             $3^3$           3        4
                              $3^4$           3        8
                              $3^4$           4        5
4                             $4^3$           3        6

The designs for $d = 2$ are given in Fisher (1943 and 1945).


The possibilities of constructing confounded designs when the block size need not necessarily 
be a power of the levels of a factor will be discussed in a subsequent communication. They 
depend on the existence of these arrays for values of N other than powers of the levels of a factor. 

5. Designs for Asymmetrical Factorial Experiments

The combinatorial problem, methods of analysis and a list of useful designs are given in
(Nair and Rao, 1946). The combinatorial problem leading to what has been called a
two-dimensional factorial design is as follows: If there are two factors $A$ and $B$ at levels $c$ and $m$,
then the $c \times m$ combinations can be represented by $a_i b_j$ $(i = 1, 2, \ldots, c;\ j = 1, 2, \ldots, m)$. An
arrangement in $b$ sets of $k$ combinations each is called a two-dimensional factorial design $c \times m$
if in the totality of the sets,

(i) each combination is used an equal number of times, and

(ii) the combination $a_i b_j$ occurs with $a_h b_g$ in $\lambda_{11}$ sets if $i \neq h$, $j \neq g$; $\lambda_{10}$ sets if
$i \neq h$, $j = g$; $\lambda_{01}$ sets if $i = h$, $j \neq g$; and $\lambda_{00} = 0$ sets if $i = h$, $j = g$.

If such an arrangement exists, we can construct designs for $c \times m$ in blocks of size $k$ and
$c \times m^2$ in blocks of size $km$ (Yates, 1935). We need only replace the $m^2$ combinations of two factors
by $m$ sets of $m$ combinations each such that they involve $(m - 1)$ comparisons belonging to the
interaction of these two factors.

If the array $(N, n, s, 2)$ exists with $N = \lambda s^2$, then identifying the $n$ factors with the levels of a
factor $A$ and the $s$ values with the levels of a factor $B$ we get the two-dimensional design $n \times s$
with $N$ blocks of size $n$ and $\lambda_{11} = \lambda_{10} = \lambda$, $\lambda_{01} = 0$.

Since the properties of an array are retained by a permutation of the values of any factor or
factors, we can get an arrangement of $(N, n, s, 2)$ such that some of the factors have the values
$1, 2, \ldots, s$ respectively for some $s$ assemblies. Let the number of such factors be $g$. Then
omitting the rest of the factors and these $s$ assemblies, we get a derived configuration involving
$g$ factors and $(N - s)$ assemblies. Taking the $g$ factors as levels of $A$ and the $s$ values as levels of $B$
and an assembly as a block we get a two-dimensional factorial design $g \times s$ with $(N - s)$ blocks
of size $g$ and $\lambda_{11} = \lambda$, $\lambda_{10} = \lambda - 1$, $\lambda_{01} = 0$. The method is illustrated below. Plackett and
Burman (1946) have given the array of Table 6.


Table 6. The Array (12, 11, 2, 2)

                        Levels of the first factor.

Blocks      7    1    2    8    3    4    5    9   10   11    6

   1        +    +    -    +    +    +    -    -    -    +    -
   2        +    -    +    +    +    -    -    -    +    -    +
   3        -    +    +    +    -    -    -    +    -    +    +
   4        +    +    +    -    -    -    +    -    +    +    -
   5        +    +    -    -    -    +    -    +    +    -    +
   6        +    -    -    -    +    -    +    +    -    +    +
   7        -    -    -    +    -    +    +    -    +    +    +
   8        -    -    +    -    +    +    -    +    +    +    -
   9        -    +    -    +    +    -    +    +    +    -    -
  10        +    -    +    +    -    +    +    +    -    -    -
  11        -    +    +    -    +    +    +    -    -    -    +
  12        -    -    -    -    -    -    -    -    -    -    -


The values inside the table correspond to the levels of the second factor at 2 levels. This
supplies the design for $c \times 2$ or $c \times 2^2$ when $c \leq 11$ utilizing only 12 blocks with $\lambda_{11} = \lambda_{10} = 3$,
$\lambda_{01} = 0$. By omitting blocks (11) and (12) and the factors 7, 8, 9, 10 and 11 we get a derived
configuration giving the designs for $6 \times 2$ or $6 \times 2^2$ with $\lambda_{11} = 3$, $\lambda_{10} = 2$, $\lambda_{01} = 0$ and only
10 blocks. Since designs for $c \times 2$ or $c \times 2^2$ can be obtained from (8, 7, 2, 2) we get the possible
types of designs shown in Table 7 for $c \times 2$ or $c \times 2^m$ in general.
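
The concurrence numbers quoted for the full and the derived configuration can be recomputed from the array. The sketch below is illustrative: it regenerates the twelve blocks as the cyclic shifts of block 1 plus a block of minus signs, uses the column order 7, 1, 2, 8, 3, 4, 5, 9, 10, 11, 6 for the factor labels, and counts how often two factor-level combinations fall in the same block. The names `plackett_burman_12` and `lambdas` are hypothetical.

```python
from itertools import combinations
from collections import Counter

GENERATOR = "++-+++---+-"                       # signs of block 1
LABELS = [7, 1, 2, 8, 3, 4, 5, 9, 10, 11, 6]    # factor label of each column

def plackett_burman_12():
    """Blocks 1-11 are cyclic left shifts of block 1; block 12 is all minus."""
    rows = [GENERATOR[k:] + GENERATOR[:k] for k in range(11)]
    rows.append("-" * 11)
    return [dict(zip(LABELS, row)) for row in rows]   # factor label -> sign

def lambdas(blocks, factors):
    """Sets of observed concurrence counts (lambda_11, lambda_10, lambda_01)."""
    pair_counts = Counter()
    for block in blocks:
        combos = [(f, block[f]) for f in factors]
        for p, q in combinations(combos, 2):
            pair_counts[(p, q)] += 1
    points = [(f, v) for f in factors for v in "+-"]
    l11, l10, l01 = set(), set(), set()
    for (i, j), (h, g) in combinations(points, 2):
        c = pair_counts[((i, j), (h, g))]
        if i != h and j != g:
            l11.add(c)       # different factor, different level
        elif i != h and j == g:
            l10.add(c)       # different factor, same level
        else:
            l01.add(c)       # same factor, different level
    return l11, l10, l01

blocks = plackett_burman_12()
print(lambdas(blocks, LABELS))                      # full configuration
print(lambdas(blocks[:10], [1, 2, 3, 4, 5, 6]))     # derived configuration
```

Each returned set contains a single value when the corresponding concurrence number is constant, which is the defining property of the two-dimensional design.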





Table 7. Types of Designs for $c \times 2^m$

Value of $c$.     Number of blocks.     $\lambda_{11}$     $\lambda_{10}$     $\lambda_{01}$

$c \leq 11$            12                  3          3          0
$c \leq 6$             10                  3          2          0
$c \leq 7$              8                  2          2          0
$c \leq 4$              6                  2          1          0


Some other types of designs are listed in (Rao, 1946).

6. General Symmetrical Factorial Experiments
(a) Contrasts Defining Main Effects and Interactions

The $s^n$ assemblies arising out of $n$ factors can be represented as $(a\, b \ldots k)$ where each letter
can assume values $0, 1, \ldots, (s - 1)$. The assembly $(0\, 0 \ldots 0)$, which is a combination of all
factors at the first level, is called the nominal assembly. Denoting by

$$i^a = m_{0a}\, i_0 + \ldots + m_{s-1,a}\, i_{s-1}$$

such that

$$\sum_j m_{ja} = 0 \quad \text{when } a \neq 0,$$

$$\sum_j m_{ja}\, m_{jb} = 0 \quad \text{when } a \neq b,$$

and $m_{ja} = 1$ for all $j$ when $a = 0$,

$i_0, i_1, \ldots, i_{s-1}$ representing the $s$ levels of the $i$-th factor, we can define symbolic products of
the form $1^a 2^b \ldots n^k$ which may be represented by $[a\, b \ldots k]$ to distinguish it from the
assembly $(a\, b \ldots k)$.

Since $[a\, b \ldots k]$ is a linear function of the assemblies we may consider orthogonality and
independence of these functions in the usual manner. There are $s^n$ functions of this type which
are all orthogonal and hence independent. The $s^n$ assemblies can then be, alternatively,
expressed as linear functions of these functions.

A function $[a\, b \ldots k]$ is called a contrast if the values are not all simultaneously zero. The
contrast belongs to the interaction of factors for which the corresponding values are not zero.
The main effects may be considered to be one factor or zero order interactions. There are
$(s - 1)^t$ orthogonal functions defining the interaction of $t$ specified factors. Any linear function
which defines a component of the interaction of these factors must necessarily be built out of
these functions. Hence we get the necessary and sufficient condition that a linear function of
the assemblies belongs to or defines a component of a certain interaction is that it is orthogonal
to every function of the type $[a\, b \ldots k]$ except those defining that interaction.

When some of the interactions are not present, i.e. when the parametric functions defining
them are identically zero, then any linear function of the contrasts defining them can be added
to a contrast belonging to any other interaction without altering its value. Hence we derive the
necessary and sufficient conditions that a linear function of assemblies measures a component of
an interaction when some of the interactions are absent as

(i) it is orthogonal to $[0\, 0 \ldots 0]$,

(ii) it is not orthogonal to at least one function of the type $[a\, b \ldots k]$ defining the
specified interaction,

(iii) it is orthogonal to functions $[a\, b \ldots k]$ defining interactions other than those
that are absent and the specified one.




It may be possible for linear functions to exist satisfying (i), (ii) and (iii) above, but depending
only on a subset of assemblies, so that contrasts defined by these functions can be measured from
a subset of the assemblies. Also the necessary and sufficient condition that a specified contrast
$[a\, b \ldots k]$ is measurable from a subset is that there exists a linear function involving only the
assemblies in the subset such that it is not orthogonal to $[a\, b \ldots k]$ but is orthogonal to every other function
of this type with the possible exception of those defining the interactions which are absent.

(b) A Class of Subsets Defined by Arrays of Strength d 

By using the properties of arrays of strength $d$ we can define subsets from which main effects
and interactions up to order $(k - 1)$ are measurable when interactions of order equal to and
greater than $(d - 1)$ $(d > k)$ are absent. Incidentally the method of constructing contrasts
defining specified interactions is also made available.

Let the subset of $N$ assemblies be an array $(N, n, s, 3)$ of strength 3. The function obtained
from $[a\, b \ldots k]$ by retaining only the assemblies contained in the subset may be represented
by $\{a\, b \ldots k\}$. Consider $\{a\, 0 \ldots 0\}$ where $a \neq 0$. This is evidently not orthogonal to
$[a\, 0 \ldots 0]$. Since all combinations of every 3 factors are equally repeated it follows that
$\{a\, 0 \ldots 0\}$ is orthogonal to $\{a'\, b\, c\, 0 \ldots 0\}$ and hence to $[a'\, b\, c\, 0 \ldots 0]$ when at least one
of $a'$, $b$, $c$ is zero, except when $a' = a$, $b = c = 0$. As this holds true for every pair of factors
taken in conjunction with the first it follows that the contrast $[a\, 0 \ldots 0]$ is measured by
$\{a\, 0 \ldots 0\}$ when all interactions involving 3 or more factors are absent. Similarly every
contrast defining a main effect is measurable. Since $\{a\, 0 \ldots 0\}$ is orthogonal to $\{0\, b\, 0 \ldots 0\}$
it follows that any two contrasts defining main effects of two factors are orthogonal. In general,
if interactions involving $d$ and more factors are absent, then a sufficient condition for a subset
to admit measurability of main effects is that it is an array of strength $d$.

Similarly, if the subset is an array of strength 4, then $\{a\, b\, 0 \ldots 0\}$ where $a \neq 0$, $b \neq 0$,
measures the contrast $[a\, b\, 0 \ldots 0]$ when interactions of 3 and more factors are absent. A
sufficient condition for the subset to admit measurability of first order interactions when
interactions involving $d$ and more factors are absent is that it is an array of strength $(d + 1)$.

In general, when interactions of order equal to and greater than $d - 1$ are absent, then an
array of strength $(d + k - 1)$ or $n$, whichever is smaller, admits the measurability of interactions
up to order $(k - 1)$ and the expressions defining these contrasts are orthogonal.

(c) Designs for Multifactorial Experiments 

Plackett and Burman (1946) have given optimum designs for multifactorial experiments from
which main effects can be estimated when all interactions of order equal to and greater than
one are absent. This means choosing a subset of assemblies satisfying the optimum properties
of an array of strength 2. It may be useful to find out multifactorial designs from which all
main effects and first order interactions can be estimated except for a possible bias introduced
by the presence of interactions involving $d$ and more factors. An array of strength $(d + 1)$ as a
design for multifactorial experiments satisfies this requirement. Incidentally some more
interactions may be measured from this array.

The analysis of such a design can be carried out in the usual manner. The sums of squares
for main effects and interactions can be calculated by retaining only the assemblies present in
the array and using proper divisors. If we are using $N$ assemblies with $n$ factors each at $s$ levels
and the array is of strength greater than 4, then the analysis is as follows:


Analysis of Arrays

                                Degrees of freedom.

Main effects                    $n(s - 1)$
First order interactions        $\tfrac{1}{2} n(n - 1)(s - 1)^2$
Error                           (obtained by subtraction)
Total                           $N - 1$





If the design is an array of strength $(d + 1)$, then the estimates of main effects and interactions
are unaffected by interactions of order greater than one and less than $d$, but the estimate of error
is enhanced by their presence. But this keeps us on the safe side in declaring the significance
of observed contrasts. If the design admits the estimability of contrasts other than those specified
above, they may be removed from the component of error.

Since by using $(N, n, s, 2t)$ we can measure all interactions up to order $(t - 1)$, assuming
higher order interactions to be not present, the number $(N - 1)$ of independent contrasts that
can be built out of $N$ assemblies cannot be less than the number of independent contrasts that
can be estimated; hence we get:

$$N - 1 \geq \binom{n}{1}(s - 1) + \ldots + \binom{n}{t}(s - 1)^t.$$

If we are using the array $(N, n, s, 2t + 1)$, then interactions up to order $(t - 1)$ at least are
measurable. If we take any contrast belonging to the interaction of $(t + 1)$ factors and retain
only those combinations which are present in this array, then this is orthogonal to contrasts arising
out of $(t + 1)$ factors which contain at least one factor in common with the first set. This shows
that at least $\binom{n-1}{t}(s - 1)^{t+1}$ more linear functions are measurable. Hence:

$$N - 1 \geq \binom{n}{1}(s - 1) + \ldots + \binom{n}{t}(s - 1)^t + \binom{n-1}{t}(s - 1)^{t+1}.$$

When $t = 0$, this inequality reduces to $N - 1 \geq s - 1$, which is true. These are the inequalities
mentioned in article 2 connecting $N$, the number of assemblies, and $n$, the number of factors, for
given strengths $2t$ and $2t + 1$.

A list of multifactorial designs involving less than 150 assemblies is given in Tables 8-10.
The maximum number of factors given by the above inequalities is denoted by max $n$.
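
The values of max $n$ follow mechanically from the two inequalities. The sketch below is an illustration with hypothetical function names; it takes the inequalities in the form "$N - 1$ is not less than the right-hand side" and searches for the largest $n$ that still satisfies the bound for a given $N$, $s$ and strength $d$.

```python
from math import comb

def bound_holds(N, n, s, d):
    """Check the inequality for strength d = 2t or d = 2t + 1."""
    t = d // 2
    rhs = sum(comb(n, i) * (s - 1) ** i for i in range(1, t + 1))
    if d % 2 == 1:
        # Extra term for odd strength 2t + 1.
        rhs += comb(n - 1, t) * (s - 1) ** (t + 1)
    return N - 1 >= rhs

def max_n(N, s, d):
    """Largest number of factors allowed by the inequality, or None if even n = d fails."""
    n = d  # an array of strength d involves at least d factors
    if not bound_holds(N, n, s, d):
        return None
    while bound_holds(N, n + 1, s, d):
        n += 1
    return n

for N, s, d in [(9, 3, 2), (27, 3, 3), (81, 3, 3), (16, 2, 4), (128, 2, 4)]:
    print(N, s, d, max_n(N, s, d))
```

The bound is only a necessary condition: an array with max $n$ factors need not exist, which is why the tables list the attained $n$ separately.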


Table 8. Multifactorial Designs (Hypercubes of Strength 2)
(Main effects measurable when interactions of order $\geq 1$ are absent.)

$N$            $s$        $n$               max $n$

$4\lambda$          2        $4\lambda - 1$          $4\lambda - 1$
$3^2$          3        4               4
$3^3$          3        13              13
$3^4$          3        40              40
$4^2$          4        5               5
$4^3$          4        21              21
$5^2$          5        6               6
$5^3$          5        31              31
$6^2$          6        3               7
$7^2$          7        8               8
$8^2$          8        9               9
$9^2$          9        10              10
$10^2$         10       3*              11
$11^2$         11       12              12
$12^2$         12       5*              13

The designs for $s = 2$ up to $\lambda = 25$ are given in Plackett and Burman (1946).

* Indicates possibly maximum number of factors for which the hypercube exists.





Table 9. Multifactorial Designs (Hypercubes of Strength 3)
(Main effects measurable when interactions of order $\geq 2$ are absent.)

$N$          $s$        $n$        max $n$

$2^3$        2        4        4
$2^4$        2        8        8
$2^5$        2        16       16
$2^6$        2        32       32
$2^7$        2        64       64
$3^3$        3        4        5
$3^4$        3        8        14
$4^3$        4        6        6
$5^3$        5        6        7

Table 10. Multifactorial Designs (Hypercubes of Strength 4)
(Main effects and first order interactions measurable when
interactions of order $\geq 2$ are absent.)

$N$          $s$        $n$        max $n$

$2^4$        2        5        5
$2^5$        2        6        7
$2^6$        2        8        10
$2^7$        2        11       15
$3^4$        3        5        6


Designs involving higher numbers of assemblies can be constructed by the methods given for 
constructing arrays of strength d. 


(d) Multifactorial Experiments with Groups of Assemblies 

In considering multifactorial experiments it has been assumed that there exists no source of
variation due to any assignable causes affecting the assemblies. In practice, it may not always
be possible to get the effects of assemblies on homogeneous material. In such cases valid
comparisons are still possible if the assemblies are assigned at random to different portions of the
material. Another method which has the advantage of reducing the error to which the contrasts
are subject is to assign groups of assemblies to fairly homogeneous portions of the experimental
material and build up contrasts from comparisons arising within the groups. This purpose is
served by blocks in field experimentation. In pharmaceutical and other experiments it has been
found that observations are affected by the temperature of the day on which the experiment is
carried out. If the experiment involves a large number of assemblies with experimentation
spread over a number of days, it is desirable to assign groups of assemblies to days in such a
manner that desired comparisons are unaffected by temperature differences of days.




If we are using only a subset of assemblies, then the problem reduces to splitting the subset
into groups such that desired contrasts are orthogonal to contrasts arising out of groups. The
only change in the analysis of variance table is to include the sum of squares due to groups
with the appropriate degrees of freedom.

Consider the designs corresponding to $(N, n, 2, 2)$ which admit the measurability of main
effects only. We may split this into two groups so that the contrast of the groups is the
interaction of the first two factors $A$ and $B$. This interaction is measurable from $(N, n, 2, 2)$ if all
combinations of 3 factors involving both $A$ and $B$ occur an equal number of times. As a necessary
condition it follows that $N$ must be a multiple of 8. If we want an arrangement in four groups
then the two factor interactions $AB$, $BC$, $CA$ of the three factors $A$, $B$ and $C$ may be chosen as
contrasts among the groups. A sufficient condition that these effects are orthogonal to main
effects is that all combinations of 3 factors involving any two of $A$, $B$ and $C$ occur an equal
number of times.

The actual splitting of the array into two groups is done as follows: All those assemblies
with identical levels for the first two factors are put in one group and the rest in another. To
get a design in four groups all assemblies containing the combinations 111, 000; 101, 010;
001, 110; and 100, 011 for the first three factors are put in the first, second, third and fourth
groups respectively. The correspondence of (0, 1) with $(+, -)$ can be made in any desired
manner if we are using the array containing the elements $+$ and $-$ used in their construction.
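
The splitting into two groups can be illustrated on the smallest case. The sketch below is not from the paper: it builds the $2^3$ array from product columns with the column $F_1 F_2$ left out (six factors), places assemblies with identical levels of the first two factors in one group, and checks that every factor appears equally often at $+$ and $-$ within each group, so that the group contrast is orthogonal to the main effects. All function names are hypothetical.

```python
from itertools import combinations, product
from math import prod

def two_level_array(r, excluded=()):
    """Product columns of r basic +/-1 columns, omitting the subsets listed in `excluded`."""
    rows = list(product([1, -1], repeat=r))
    subsets = [c for k in range(1, r + 1) for c in combinations(range(r), k)
               if c not in excluded]
    return [tuple(prod(row[i] for i in sub) for sub in subsets) for row in rows]

def split_two_groups(assemblies):
    """Group 1: first two factors at identical levels; group 2: the rest."""
    same = [a for a in assemblies if a[0] == a[1]]
    diff = [a for a in assemblies if a[0] != a[1]]
    return same, diff

def balanced(group):
    """True if every factor is at + and - equally often within the group."""
    return all(sum(col) == 0 for col in zip(*group))

arr = two_level_array(3, excluded={(0, 1)})   # drop the column F1 F2
same, diff = split_two_groups(arr)
print(len(arr[0]), len(same), len(diff), balanced(same), balanced(diff))
```

If the column $F_1 F_2$ were kept as a factor, it would be constant within each group and its main effect would be confounded with the group contrast.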

Arrays satisfying the above properties can be constructed when $N = 2^r$. To get designs in
two groups we have to exclude the columns with an even number of selections of $F_1, F_2, \ldots, F_r$
(article 2) whenever $F_1$ or $F_2$ or $F_1$ and $F_2$ are involved, if $F_1$ and $F_2$ are made to correspond with
$A$ and $B$. To get designs in four groups we have to see that this holds good with any pair chosen
out of $F_1$, $F_2$ and $F_3$. A list of designs in two and four groups is given in Table 11.


Table 11. Factorial Arrangements in Groups

(Main effects measurable when interactions of order $\geq 1$ are absent
except $AB$ for 2 groups and $AB$, $BC$, $CA$ for 4 groups.)

                        Number of factors for
$N$          $s$        2 groups.        4 groups.

$2^3$        2        6                4
$2^4$        2        14               12
$2^5$        2        30               28
$2^6$        2        62               60
$2^7$        2        126              124


In case we are using multifactorial designs of the type $(N, n, s, 2)$ for $s > 2$ we may impose
the condition that all combinations of 3 factors involving the first two are equally repeated. In

Table 12. Factorial Arrangements in $s$ Groups

(Main effects measurable when interactions of order $\geq 1$
are absent except $AB$.)

$N$          $s$        Number of factors.

$3^3$        3        11
$3^4$        3        38
$4^3$        4        18
$5^3$        5        27





this case $(s - 1)^2$ contrasts are available of which a subset of $(s - 1)$ contrasts can be used as
contrasts of the groups. The assemblies in the array are divided into $s$ groups with the help of
the combinations of the first two factors. If these $s^2$ combinations are identified with the cells
of a latin square, with the factors corresponding to rows and columns, then the combinations
occurring with a latin letter determine the assemblies in a group.

If we are using $(N, n, s, 3)$ then the above conditions are automatically satisfied. Hence they
can be split up into groups for the same number of factors given in Table 9, thus supplying
designs for factorial experiments from which main effects are measurable without involving first
order interactions.

If we are using $(N, n, s, 4)$ and desire to estimate main effects and first order interactions we
have to choose for contrasts of the groups a three-factor interaction. As a sufficient condition,
we may impose the restriction that all combinations of every 5 factors involving the first three
occur equally often, in which case the contrasts of $ABC$ are available for comparisons of the
groups. The minimum number of assemblies is $s^5$.


Table 13. Factorial Arrangements in 2 Groups

(Main effects and first order interactions measurable when
interactions of order $\geq 2$ are absent except $ABC$.)

$N$          $s$        Number of factors.

$2^5$        2        6
$2^6$        2        8
$2^7$        2        11


There is a choice in these factorial arrangements in groups as to which of the factors are to
be identified with $A$, $B$ and $A$, $B$, $C$, whose interactions are used for confounding in the groups
in the above cases. Since we are assuming that all interactions of this order with the possible
exception of these are absent we may identify with $A$, $B$, $C$ those factors which indicate slight
interactions. If these interactions are not negligible, it may be, from a practical point of view,
necessary to measure them, in which case arrays of higher strength have to be used.


Bibliography 

Fisher, R. A. (1943), “Theory of Confounding in Factorial Experiments in Relation to Theory of
Groups,” Ann. Eugenics, 12, 291.

Fisher, R. A. (1945), “A System of Confounding for Factors with more than Two Alternatives Giving
Completely Orthogonal Cubes and Higher Powers,” ibid., 12, 283.

Kishen, K. (1942), “On Latin and Hypergraeco Latin Cubes and Hypercubes,” Current Science,
11, 98.

Plackett, R. L., and Burman, J. P. (1946), “The Design of Optimum Multifactorial Experiments,”
Biom., 33, 305.

Nair, K. R., and Rao, C. R. (1941), “Confounded Designs for Asymmetrical Factorial Experiments,”
Science and Culture, 7, 313.

Nair, K. R., and Rao, C. R. (1942), “Confounded Designs for $k \times p^m \times q^n \times \ldots$ Type of Experiments,” ibid., 7, 361.

Nair, K. R., and Rao, C. R. (1947), “Confounded Designs for Asymmetrical Factorial Experiments,” Biom. (in press).

Rao, C. R., “On Hypercubes of Strength $d$ and a System of Confounding in Factorial Experiments,”
Bull. Cal. Math. Soc., 38, 67.

Yates, F., “Factorial Experiments,” Tech. Com. 35 Imp. Bu. of Soil Science.





Exhibition of Mechanical Aids to Statistical Computation 

On July 20, 1946, the Research Section held an exhibition of machines and instruments 
designed to facilitate some of the lengthy and tedious computations necessary in statistical work.
The idea of organizing such an exhibition arose out of the “Symposia on Autocorrelation in 
Time Series” held earlier in the session, when a number of new machines for the calculation of 
autocorrelations were described, but could not be demonstrated at the meeting itself. The venture 
was more than justified by the excellent attendance and the great interest shown in the exhibits. 

The exhibition was arranged in two sections. At Imperial College, South Kensington, in a 
room kindly placed at the Society’s disposal by Professor H. Levy, the machines described at the 
Symposium were demonstrated in action. Particular interest was shown in the Relay Computor, 
designed by Mr. W. Barnes for the use of Dr. L. B. C. Cunningham and Mr. W. R. Hynd at the 
Ministry of Aircraft Production, and in the Correlogram Calculator made by the National Physical 
Laboratory to Mr. G. A. R. Foster's design. 

Calculating machines of the more orthodox type, including the National Accounting Machine, 
were also demonstrated at Imperial College by the Scientific Computing Service, who also 
arranged a comprehensive display of tables indispensable for statistical and other types of compu- 
tation. Dr. L. J. Comrie introduced the exhibition with a short lecture describing the relative 
merits of the various machines. 

In the afternoon, by courtesy of the War Office, visitors were invited to attend a demonstration 
of Hollerith equipment which was held at the Q. Stats. Department, Northumberland Avenue, 
and which was well attended. The essential part of the calculation of serial correlations (recently 
described in the Supplement, Part II, 1946) was demonstrated on the Reproducer, the well-known 
method of summary multiplication (or progressive digiting as it is called in the United States) was 
shown on a Senior Roller Tabulator, and numerous smaller demonstrations were given on the 
Sorter, Multiplying Punch and other punched card equipment. 

The Society is greatly indebted to all who co-operated in making the exhibition an unqualified 
success. In particular they wish to thank Professor H. Levy and Mr. G. Barnard of Imperial 
College, Dr. L. J. Comrie and his staff at the Scientific Computing Service, the Q. Stats Department 
of the War Office, and Mr. Michaelson of the British Tabulating Machine Company.

The meeting was well reported by the Manchester Guardian in their “London Letter.” 





SUPPLEMENT TO THE

Journal of the Royal Statistical Society

Vol. IX, No. 2, 1947


Statistical Investigation of Casualties Suffered by Certain Types of Vessels.

By S. Vajda.

[Read before the Research Section of the Royal Statistical Society, April 2, 1947,

Mr. H. L. Seal in the Chair.]

This paper gives a description of some statistical problems and of the technique used to deal 
with them. Statisticians may be interested in the methods of analysis utilized in the 
investigation, which was carried out in the Statistical Branch of the Admiralty. 


1. The Material.

Reports were received by the Statistical Branch of completion of new vessels of the types investi- 
gated and of casualties to these vessels. The precise definition of “Casualty” or “Accident” and 
the decision whether a succession of mishaps should be considered as a single casualty or not 
remained, of course, outside the duties of this Branch, which had to accept in this respect the 
information received. 

Reports were also received of events which made a vessel cease to form part of the investi- 
gation. This is not necessarily due to a casualty resulting in a complete loss, but may be a con- 
version to some other type, sale to another Government, or generally any reason which makes it 
certain that future casualties will not be made known to the collecting section in the routine way. 
Such events are referred to as “Exits.” 

The technique of investigation was developed so as to make the best use of an installation of 
Hollerith machines consisting of punches, sorter, reproducer, collator and tabulator. A detailed 
description of the work carried out on them is given in the Appendix. 

The basic notion of this investigation is the “Accident Rate.” It is defined as the ratio of the
number of accidents to the number of vessels, each vessel being multiplied by the number of days
during which it was exposed to the risk of accidents. It should be mentioned, however, that no
deduction was made for days spent in harbour, dock, etc. It would have been impossible with the
data available to take account of these days. On the whole, the effect of this inaccuracy is small.
Its tendency is to underestimate the accident rate.

This paper deals only with a period covering the three years from December 1, 1941, to 
November 30, 1944. The following table will give some idea of the extent of the corresponding 
material : 





Table 1.

              Vessels    First   All                           Vessels    First   All
Month.        completed. cas.    cas.   Exits.   Month.        completed. cas.    cas.   Exits.

Dec., 1941        2       —       —      —       June, 1943      111       16      16      7
Jan., 1942        3       —       —      —       July, 1943      106       12      13     10
Feb., 1942       11       —       —      —       Aug., 1943      108       11      15      6
Mar., 1942       16       —       —      —       Sept., 1943     111        5       7     10
Apr., 1942       26       —       —      —       Oct., 1943      116       21      27     15
May, 1942        42       —       —      2       Nov., 1943      100       16      22     13
June, 1942       51       —       —      3       Dec., 1943      124       49      61     12
July, 1942       52       —       —      8       Jan., 1944       75       57      90     12
Aug., 1942       58        1       1     2       Feb., 1944       78       47      69      5
Sept., 1942      67        2       2     4       Mar., 1944       81       70     111      7
Oct., 1942       65        1       1     3       Apr., 1944       74       36      63      5
Nov., 1942       64        4       4     7       May, 1944        64       29      39      2
Dec., 1942       81       15      16     1       June, 1944       55       13      23      6
Jan., 1943       77       13      14     8       July, 1944       49        4       7      5
Feb., 1943       80       22      24     7       Aug., 1944       48        9      17      4
Mar., 1943       99       30      33    19       Sept., 1944      42        8      15      3
Apr., 1943      107       11      16     8       Oct., 1944       46        7      12      3
May, 1943       116       18      22     4       Nov., 1944       42        2       4      2

                                                 Totals         2447      529     744    203

(cas. = casualties.)


There were altogether:

    373 vessels with 1 casualty   =  373 casualties
    117 vessels with 2 casualties =  234 casualties
     24 vessels with 3 casualties =   72 casualties
     11 vessels with 4 casualties =   44 casualties
      3 vessels with 5 casualties =   15 casualties
      1 vessel  with 6 casualties =    6 casualties

    Total: 529 vessels with altogether 744 casualties.


1918 vessels remained without a casualty during the period under review.


Table 2.

Vessels in Service During the Second Half of the Month.

Month.              Number.      Month.              Number.

December, 1941          2        June, 1943           1045
January, 1942           5        July, 1943           1141
February, 1942         16        August, 1943         1243
March, 1942            32        September, 1943      1344
April, 1942            58        October, 1943        1445
May, 1942              98        November, 1943       1532
June, 1942            146        December, 1943       1644
July, 1942            190        January, 1944        1707
August, 1942          246        February, 1944       1780
September, 1942       309        March, 1944          1854
October, 1942         371        April, 1944          1923
November, 1942        428        May, 1944            1985
December, 1942        508        June, 1944           2034
January, 1943         577        July, 1944           2078
February, 1943        650        August, 1944         2122
March, 1943           730        September, 1944      2161
April, 1943           829        October, 1944        2204
May, 1943             941        November, 1944       2244

For the construction of approximate accident rates, we assume that in any calendar month 
exits and entries all happened in the middle of the month. A table of vessels in service during 
the second half of that month can then be constructed by subtracting the number under “Exit” 
from that under “Vessels completed” and accumulating the results line by line. The result is 
shown in Table 2. 

If we take the means of the previous and the current monthly numbers, e.g. 1 for December,
1941, 3.5 for January, 1942, etc., and multiply these means by the numbers of days in the current
month, approximate numbers of ship-days are arrived at; these are given in Columns 2 and 5 of
Table 3 below. Comparison with the casualties of the same month, as given in Table 1, leads to the
accident rates in Columns 3 and 6 below.


Table 3.

               Ship-days     Accident rates                  Ship-days     Accident rates
Month.         exposed.      × 10,000.        Month.         exposed.      × 10,000.

Dec., 1941          31.0         —            June, 1943      29,790.0         5.37
Jan., 1942         108.5         —            July, 1943      33,883.0         3.84
Feb., 1942         294.0         —            Aug., 1943      36,952.0         4.06
Mar., 1942         744.0         —            Sept., 1943     38,805.0         1.80
Apr., 1942       1,350.0         —            Oct., 1943      43,229.5         6.25
May, 1942        2,418.0         —            Nov., 1943      44,655.0         4.93
June, 1942       3,660.0         —            Dec., 1943      49,228.0        12.39
July, 1942       5,208.0         —            Jan., 1944      51,940.5        17.33
Aug., 1942       6,758.0        1.48          Feb., 1944      50,561.5        13.65
Sept., 1942      8,325.0        2.40          Mar., 1944      56,327.0        19.71
Oct., 1942      10,540.0         .95          Apr., 1944      56,655.0        11.12
Nov., 1942      11,985.0        3.34          May, 1944       60,574.0         6.44
Dec., 1942      14,508.0       11.03          June, 1944      60,285.0         3.82
Jan., 1943      16,817.5        8.32          July, 1944      63,736.0         1.10
Feb., 1943      17,178.0       13.97          Aug., 1944      65,100.0         2.61
Mar., 1943      21,390.0       15.43          Sept., 1944     64,245.0         2.33
Apr., 1943      23,385.0        6.84          Oct., 1944      67,657.5         1.77
May, 1943       27,435.0        8.02          Nov., 1944      66,720.0          .60

                                              Total        1,112,479.0         6.69
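The construction of Tables 2 and 3 from Table 1 can be retraced mechanically. The following is a minimal sketch in Python, not part of the original investigation: it uses only the first nine months of Table 1, and the names `TABLE_1` and `ship_days_table` are illustrative.

```python
import calendar

# First months of Table 1: (year, month, vessels completed, all casualties, exits).
TABLE_1 = [
    (1941, 12, 2, 0, 0),
    (1942, 1, 3, 0, 0),
    (1942, 2, 11, 0, 0),
    (1942, 3, 16, 0, 0),
    (1942, 4, 26, 0, 0),
    (1942, 5, 42, 0, 2),
    (1942, 6, 51, 0, 3),
    (1942, 7, 52, 0, 8),
    (1942, 8, 58, 1, 2),
]


def ship_days_table(rows):
    """Return (year, month, vessels in service, ship-days, rate x 10,000) per month."""
    out = []
    previous = 0  # vessels in service before the first month
    for year, month, completed, casualties, exits in rows:
        # Entries and exits are assumed to fall in the middle of the month (Table 2).
        current = previous + completed - exits
        days = calendar.monthrange(year, month)[1]
        # Mean of previous and current numbers, times days in the month (Table 3).
        ship_days = (previous + current) / 2 * days
        rate = 10_000 * casualties / ship_days
        out.append((year, month, current, ship_days, rate))
        previous = current
    return out


for row in ship_days_table(TABLE_1):
    print(row)
```

Extending `TABLE_1` with the remaining months of Table 1 should give the rest of Tables 2 and 3 in the same way.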


The accident rates show very conspicuous peaks in winter and troughs in summer. If they
are telescoped into rates for months irrespective of calendar year by adding the exposed ship-days
and accidents and computing the ratios of these items, the following table results:


Table 4.

Month.    Ship-days.    Accidents.   Rate.      Month.    Ship-days.     Accidents.   Rate.

Dec.       63,767.0         77       12.08      June        93,735.0        39        4.16
Jan.       68,866.5        104       15.10      July       102,827.0        20        1.95
Feb.       68,033.5         93       13.67      Aug.       108,810.0        33        3.03
Mar.       78,461.0        144       18.35      Sept.      111,375.0        24        2.15
Apr.       81,390.0         79        9.71      Oct.       121,427.0        40        3.29
May        90,427.0         61        6.75      Nov.       123,360.0        30        2.43

                                                Totals   1,112,479.0       744        6.69
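As a worked check of the telescoping, the December line of Table 4 can be rebuilt from the three Decembers of Tables 1 and 3. The sketch below is illustrative only; `pooled_rate` is a hypothetical helper name.

```python
# (ship-days exposed, all casualties) for Dec. 1941, Dec. 1942 and Dec. 1943,
# taken from Tables 3 and 1 respectively.
DECEMBERS = [(31.0, 0), (14_508.0, 16), (49_228.0, 61)]


def pooled_rate(cells):
    """Add ship-days and accidents, then form the ratio (x 10,000)."""
    days = sum(d for d, _ in cells)
    accidents = sum(c for _, c in cells)
    return days, accidents, 10_000 * accidents / days


days, accidents, rate = pooled_rate(DECEMBERS)
print(days, accidents, round(rate, 2))
```

Note that the pooled rate is a ratio of sums, not a mean of the monthly rates, so months with many ship-days carry correspondingly more weight.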

It would be interesting to apply some periodicity analysis to these data. Such an analysis
would have to take account of the different weights attached to the rates and no satisfactory
method appears to be available.





2. The Problems and the Tables which Form the Basis of the Analysis.

The problems which were investigated and whose treatment can be considered typical of 
statistical work carried out with the aid of a punch-card installation were these : 

(i) Is the accident liability of all vessels the same? Or do the casualty rates show 
differences from builder to builder, say ? 

(ii) Does the accident liability change with the increase of age ? 

(iii) Does the fact that the vessel has been subject to an accident alter the probability of 
further accidents? 

In order to find an answer to these questions, a series of tables was constructed. Tables I, II
and III show, for All Casualties, First Casualties, and Later Casualties respectively, the material
distributed according to Age Attained and Builders. The successive age groups 1, 2, etc., stand
for a period of 50 days each, viz. 0-49, 50-99, etc. The columns give:

(1) The number of days exposed at the corresponding ages.

(2) The number of casualties suffered at these ages.

(3) The ratio (2)/(1), i.e. the accident rates, multiplied by 10,000 for convenience.

The grand totals are:

744 casualties in 1,110,838 days, giving an over-all rate of $6.6977 \times 10^{-4}$ for all casualties.
When comparing this result with Table 3 it should be remembered that the days given there were
approximations only.

529 first casualties against 922,716 days exposed, giving a rate of first casualties of
$5.7331 \times 10^{-4}$; and finally

215 later casualties against a total of 188,122 days exposed, which gives a rate of later casualties
of $11.4288 \times 10^{-4}$.
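The three grand-total rates follow directly from the quoted counts. A minimal sketch of the arithmetic (the dictionary name `GRAND_TOTALS` is illustrative):

```python
# (casualties, days exposed) as quoted in the grand totals above.
GRAND_TOTALS = {
    "all": (744, 1_110_838),
    "first": (529, 922_716),
    "later": (215, 188_122),
}

# Accident rate x 10,000 for each class of casualty.
rates = {name: 10_000 * c / e for name, (c, e) in GRAND_TOTALS.items()}

for name, rate in rates.items():
    print(f"{name:5s} {rate:.4f} x 10^-4")

# Exposure before and after the first casualty adds up to the total exposure.
assert 922_716 + 188_122 == 1_110_838
```

The rate of later casualties is roughly twice the rate of first casualties, which is the contrast question (iii) is concerned with.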

For reasons connected with theoretical considerations set out in the next Section the following 
tables were also constructed: 

A frequency distribution of the intervals which preceded the accidents (Table IV). 

A distribution of casualties according to their order and to the age of the vessel at the time of 
the casualty (part of Table VII). 

Table V: Here the vessels are sorted according to their highest age attained during the period
of investigation. For each group the final stratification into vessels with 0, 1, 2 . . . casualties
was recorded. The table includes a column giving the total of all casualties and the ratio
casualties/vessels for each line. The column “Number of Vessels” is identical with the
corresponding column of Table D of the Appendix.


3. Analysis of the Material. 

The previous chapters give a summary of the material collected. As far as results are concerned 
the tables which emerge are, perhaps, disappointing, in that they allow conclusions to be reached 
almost by inspection and consequently do not leave much scope for subtle analysis. However, 
as this paper is concerned with methods rather than results, a few possible ways of extracting 
information will be mentioned, in the hope that they may be of some value in future, similar 
investigations. 

a. Distribution of Intervals. 

As a preliminary to more detailed tests, we ask whether it may be assumed that the accident 
liability, being the same for all vessels, is independent of age and of previous accidents. The 
question can be answered by a consideration of the spells of service between completion and first 
accident and those between successive accidents. 

If the probability of an accident during the time interval $s/N$ is $qs/N$, then the probability that
the first accident happens in the $N$th of such successive intervals is clearly

$$\left(1 - \frac{qs}{N}\right)^{N-1} \cdot \frac{qs}{N}.$$

When $N$ tends to infinity, this probability tends to $q \exp(-qs)\,ds$. This is then the probability
density of intervals of length $s$ between accidents.* The figures in Table IV do not agree with this
frequency distribution.
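The passage to the limit can be illustrated numerically. In the sketch below the probability of a first accident in the $N$th interval is divided by the interval width $s/N$ (which plays the part of $ds$) and compared with $q \exp(-qs)$; the values of `q` and `s` are illustrative assumptions, not figures used in the paper.

```python
import math


def first_accident_density(q, s, n):
    """P(first accident in the n-th interval of width s/n), per unit of time."""
    width = s / n
    p = q * width  # probability of an accident within one interval
    return (1 - p) ** (n - 1) * p / width


q, s = 6.7e-4, 222.0  # illustrative rate per day and interval length in days
limit = q * math.exp(-q * s)

for n in (10, 100, 10_000):
    print(n, first_accident_density(q, s, n), limit)
```

As `n` grows, the discrete expression settles on the exponential density, which is the frequency distribution Table IV is compared with.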

In order to find some explanation for this fact we calculate the average intervals for all builders 
and obtain: 


Table 5.

            Total length of    Number of      Average
Builder.    intervals.         casualties.    intervals.

A               22,759            107            213
B               21,274            104            205
C               33,988            141            241
D               20,933             83            252
E               43,177            197            219
F               13,361             48            278
G                9,568             64            150

Total          165,060            744            222


It is obvious that there are very significant differences between builders. The following table, which 
classifies casualties according to whether they were first or later accidents, leads to the same con- 
clusion. It must be clear, however, that they refer only to those vessels which did suffer at least 
one casualty. Thus, taken by themselves, they can only give a very rough indication. 


Table 6.

            Average of spell before    Average of intervals
Builder.    first casualty.            before later casualties.

A                  234                        152
B                  214                        185
C                  294                        113
D                  317                         94
E                  265                        134
F                  286                        211
G                  146                        165

All                255                        140


b. Contingency Table Analysis. 

We will now attempt to find out something more about the possible reasons for differences in
the accident rates. There are at least three which may be operative: origin from different
builders, dependence on age, dependence on previous casualties. We therefore have constructed
Tables II and III, which give E (the days exposed) and C (the casualties) for every combination of
the attributes age (a), builder (b) and order of casualty (c) (i.e. whether first or later casualty).
The combination of these tables can be considered as a threefold contingency table with entries
of unequal weight. In order to avoid some of the consequent complications, we shall use
hypotheses somewhat different from those usually tested. Instead of stating hypotheses such as
“There is no (a)-effect,” or “no (ab)-interaction,” we test hypotheses of the type “there is no
effect apart from, possibly, (a)” or “no effect apart from, possibly, (a), (b) and/or (ab),” etc. The
test for these hypotheses is comparatively simple, even for unequal weights, as will be shown in
the following paragraphs (4, 5).

* This approach was first suggested by D. R. Cox (8).

Let the number of days exposed in the position $(a, b, c)$ be denoted by $E_{abc}$, the corresponding
casualties by $C_{abc}$ and the ratio $C_{abc}/E_{abc}$ by $Q_{abc}$. Suppose the number of values of $a$, $b$ and $c$
to be $p + 1$, $q + 1$ and $r + 1$ respectively. We then fix constants $n_{ijk}$ ($i = 0, 1, \ldots, p$;
$j = 0, 1, \ldots, q$; $k = 0, 1, \ldots, r$) so that, for all $a$, $b$ and $c$, we have

$$Q_{abc} = \sum_{i,j,k} n_{ijk}\, P_i(a) P_j(b) P_k(c) \tag{1}$$

where the $P_i$ are orthogonal polynomials of order $i$ such that $\sum_a P_i(a) P_j(a) = 0$ for $i \neq j$, and
similarly for $b$ and $c$. ($P_i(a)$ and $P_i(b)$ are of the same order $i$, but they are not identical if their
arguments extend over different ranges.) Moreover, we assume for convenience that an arbitrary
constant factor is fixed so that $\sum_a [P_i(a)]^2 = 1$, and similarly for $b$ and $c$, as before. The
solution of the set (1) with respect to $n_{ijk}$ gives

$$n_{ijk} = \sum_a \sum_b \sum_c Q_{abc}\, P_i(a) P_j(b) P_k(c) \tag{2}$$

We want now to find out what it means if $n_{i00} = 0$ for all values of $i$ which are different from
zero. Because of (2), this means

$$\sum_a P_i(a) \sum_{b,c} Q_{abc} = 0 \qquad (i = 1, 2, \ldots, p) \tag{3}$$

We have here $p$ homogeneous equations for the $p + 1$ unknown $\sum_{b,c} Q_{abc} = Q_{a00}$, say
($a = 1, 2, \ldots, p + 1$).

In view of $\sum_a P_i(a) = 0$ we see that (3) is equivalent to $Q_{100} = Q_{200} = \ldots = Q_{p+1,00}$.

Thus $n_{i00} = 0$ for all values of $i \neq 0$ means that there is no (a)-effect.

To go one step further, let $n_{ij0} = 0$ for all values of $i$ and $j$ different from zero. This leads to
$p \cdot q$ homogeneous equations for the $(p + 1)(q + 1)$ unknown $\sum_c Q_{abc} = Q_{ab0}$, say.

We can therefore choose, say, $Q_{110}, Q_{120}, \ldots, Q_{1,q+1,0}, Q_{210}, Q_{310}, \ldots, Q_{p+1,1,0}$, and the others
will thereby be fixed. It is immediately seen that the above set of linear equations can now be
written

$$Q_{110} - Q_{210} = Q_{120} - Q_{220} = \ldots = Q_{1,q+1,0} - Q_{2,q+1,0}$$
$$\cdots$$
$$Q_{110} - Q_{p+1,1,0} = Q_{120} - Q_{p+1,2,0} = \ldots = Q_{1,q+1,0} - Q_{p+1,q+1,0}$$

Thus, if $n_{ij0} = 0$ for all values of $i$ and $j$ which are not zero, there is no (ab) interaction.

Generalization to more than three subscripts and to higher order interactions is obvious, but
will not be pursued here.

Let us now assume that the variance of all $Q_{abc}$ is the same. Then if we wish to test the
significance of any effect we put the appropriate coefficients zero and decide whether the minimum
of the sum of squares

$$\sum_{abc} \Big[ Q_{abc} - \sum_{ijk} n_{ijk}\, P_i(a) P_j(b) P_k(c) \Big]^2 \tag{4}$$

differs significantly from zero. The normal equations of this set have the same solutions for the
remaining $n_{ijk}$ as were given in (2), whatever terms we omit. It follows that the minimum of (4) is

$$\sum_{abc} Q_{abc}^2 - \sum_{(ijk)} \Big[ \sum_{abc} Q_{abc}\, P_i(a) P_j(b) P_k(c) \Big]^2 = \sum_{(i'j'k')} \Big[ \sum_{abc} Q_{abc}\, P_{i'}(a) P_{j'}(b) P_{k'}(c) \Big]^2$$

where the set $(i, j, k)$ refers to those coefficients which were retained and $(i', j', k')$ to those which
were omitted. It will be noticed that these expressions are, in fact, those which are characteristic
of the various effects and interactions tested in the traditional Analysis of Variance technique.
The more terms that are omitted, the larger, of course, is the minimum which is obtained. The
number of degrees of freedom is equal to the number of terms in expressions like (4) which have
been omitted.

This simple procedure cannot be applied when the variance of $Q_{abc}$ is not a constant, but is
proportional to $E_{abc}^{-1}$. In this case the expression which must be made a minimum is

$$\sum_{abc} E_{abc} \big[ Q_{abc} - n_{000} - n_{100} P_1(a) - \ldots \big]^2 \tag{5}$$





The testing of a hypothesis still consists in omitting certain appropriate terms on the r.h.s. of (1),
and comparing the resulting minimum of the quadratic expression of the type (5) with an estimate
of the variance of the weighted mean of the Q's.

If, then, we test the hypothesis that no effect exists apart from (a), or in other words that not
only (b) and (c), but also (bc), (ab) and (ac) are not significant, then we retain only those terms of
(5) which have zero subscripts in the second and third place, obtaining

$$\sum_{abc} E_{abc} \big[ Q_{abc} - n_{000} - n_{100} P_1(a) - \ldots - n_{p00} P_p(a) \big]^2$$

which must be made a minimum. Simple calculations show that the normal equations for this
condition are solved by

$$\sum_i n_{i00} P_i(a) = \frac{\sum_{bc} C_{abc}}{\sum_{bc} E_{abc}}$$

and that the value of the minimum thus obtained is

$$\sum_{abc} \frac{C_{abc}^2}{E_{abc}} - \sum_a \frac{\big(\sum_{bc} C_{abc}\big)^2}{\sum_{bc} E_{abc}} \tag{6}$$

Similarly, the hypothesis that no effect is significant apart from, possibly, (a), (b) and/or (ab), is
to be tested by

$$\sum_{abc} \frac{C_{abc}^2}{E_{abc}} - \sum_{ab} \frac{\big(\sum_c C_{abc}\big)^2}{\sum_c E_{abc}} \tag{7}$$
In the present case the following sums were obtained:


Table 7.

$\sum_{abc} C_{abc}^2 / E_{abc}$                                   = .87296
$\sum_{ab} (\sum_c C_{abc})^2 / \sum_c E_{abc}$                    = .70425
$\sum_{ac} (\sum_b C_{abc})^2 / \sum_b E_{abc}$                    = .60032
$\sum_a (\sum_{bc} C_{abc})^2 / \sum_{bc} E_{abc}$                 = .52525
$\sum_b (\sum_{ac} C_{abc})^2 / \sum_{ac} E_{abc}$                 = .57341
$\sum_c (\sum_{ab} C_{abc})^2 / \sum_{ab} E_{abc}$                 = .54900
$(\sum_{abc} C_{abc})^2 / \sum_{abc} E_{abc}$                      = .49830

The following list contains some hypotheses and the corresponding sums of squares, obtained
from expressions like (6) and (7), with their degrees of freedom:


Table 8.

Hypothesis:
No effect apart from     d. of f. n.     s.s.       $\chi^2$     $\sqrt{2\chi^2} - \sqrt{2n - 1}$

(a), (b), (ab)              154         .16871       253           5.0
(b), (c), (bc)              294         .25266       379           3.3
(a), (c), (ac)              264         .27264       409           5.6
(a)                         286         .34771       522           8.4
(b)                         301         .29955       447           5.4
(c)                         306         .32396       486           6.4
No effect at all            307                      562           . . .





It must now be decided what estimate of the variance of the weighted mean of the Q's should
be used. In the customary Analysis of Variance method a certain aggregate of effects (usually
some combination of higher order interactions) is assumed to be not significant. Omitting these
terms, we obtain a minimum $M_1$. We then omit further terms, which are characteristic of the
effects to be tested, and obtain thus a (higher) minimum $M_2$; $(M_2 - M_1)/M_1$ is then entered in a
table of the variance ratio, such as Table V in (6), and its significance gauged.

In the present case we do not know of any effect or aggregate of effects which is certain to be
without significance. However, in view of the smallness of $q = \sum C / \sum E$ we can take $q$ itself as
an estimate of the population variance of $\sqrt{E_{abc}}\, Q_{abc}$. The degrees of freedom on which this
estimate is based are certainly large enough to use that line of the table of the variance ratio which
corresponds to $\infty$, and therefore, instead of dividing the sums of squares by their degrees of freedom,
we divide by $q$ and enter the $\chi^2$ tables to test the significance of the result. This has been done
in the table above and we find that the age (a) has some effect, although a much smaller one
than the other attributes. Thus a result which can be guessed by the inspection of the tables
is confirmed.
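The way Table 8 follows from Table 7 can be retraced in a few lines. This is a sketch only: the sums of squares are taken as differences of the Table 7 entries, and the last column as the usual large-sample normal deviate $\sqrt{2\chi^2} - \sqrt{2n - 1}$. Only rows whose Table 7 entries are clearly legible are included, and the $\chi^2$ values are copied from Table 8 rather than recomputed.

```python
import math

TOTAL = 0.87296  # sum of C^2/E over all cells (Table 7)

# (terms retained, degrees of freedom n, grouped sum from Table 7, chi-squared from Table 8)
ROWS = [
    ("(a), (b), (ab)", 154, 0.70425, 253),
    ("(a)", 286, 0.52525, 522),
    ("(b)", 301, 0.57341, 447),
]


def sum_of_squares(grouped):
    """Minimum of the weighted quadratic form, as in expressions (6) and (7)."""
    return TOTAL - grouped


def normal_deviate(chi2, n):
    """Approximate standard normal deviate for chi-squared with n d.f. (large n)."""
    return math.sqrt(2 * chi2) - math.sqrt(2 * n - 1)


for label, n, grouped, chi2 in ROWS:
    print(label, round(sum_of_squares(grouped), 5), round(normal_deviate(chi2, n), 1))
```

A deviate of 5 or more is far beyond any conventional significance level, so each of these restricted hypotheses is rejected.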

Having thus found that the age has some effect, it is natural to ask whether, on the whole, 
higher ages lead to more or to fewer casualties. The coefficients of linear regression of accident 
rates against age were computed for first casualties and give the following picture: 

Table 9.

Builder                  A        B        C       D        E        F        G        All
Coeff. × 10^6          −2.27   −26.09    2.16   23.66   −15.01    3.58   −105.28    −4.98

These are the regressions on age groups, denoting age 0-49 by 1, 50-99 by 2, etc. The signs
show that on the whole the occurrence of accidents decreased with higher ages. No test of
significance of these regression coefficients was carried out; it would, in fact, have been illogical
to do so, because we know already that the age has some effect.


c. An Application of the Theory of the Stratified Population. 

Another promising approach can be based on a paper by M. Greenwood and G. U. Yule (1).
After showing the defects of earlier attempts the authors give, in Section III of their paper, formulae
for the distribution of a population into groups of members who have suffered 0, 1, 2 . . .
accidents respectively, within a given time. It happens that those results, which are relevant to our
purpose, are special cases of some work which was concerned with a stationary stratified
population which is subject to mortality and to promotion (2, 3). A few remarks on this subject
will perhaps be appropriate here.

Let us first consider the population of vessels before their first accident, if any.

If the rates of accident were the only operative decremental forces, then their operation could
be described as follows:

Let the number of vessels without accident at time $t$ be $l_t^0$ and define a rate of accident by

$$\nu_{0t} = -\frac{dl_t^0}{l_t^0\, dt};$$

then clearly $l_t^0 = l_0^0 \exp\big(-\int_0^t \nu_{0t}\, dt\big)$, and if $\nu_{0t}$ is a constant, say, $\nu_0$, we obtain the value
$\exp(-\nu_0 t)$ for the proportion of all vessels which have not had an accident.

Now in our present case there are other forces of decrement working, because vessels disappear
from $l_t^0$ for other reasons than accidents. Let us denote the number of all vessels observed at age $t$
by $l_t$ and introduce, as force of decrement operating on them (and representing exits as well as
disappearance through reaching the end of the period under investigation),
$\mu_t = -\dfrac{dl_t}{l_t\, dt}$. This gives $l_t = l_0 \exp\big(-\int_0^t \mu_t\, dt\big)$. Now $l_t^0$ is itself subject to $\mu_t$ and $\nu_{0t}$.
Because $l_0 = l_0^0$, and the two forces of decrement are independent, we have

$$l_t^0 = l_0 \exp\Big[-\int_0^t (\mu_t + \nu_{0t})\, dt\Big] = l_t \exp\Big(-\int_0^t \nu_{0t}\, dt\Big).$$





Once again, if $\nu_{0t}$ is constant, $l_t^0 = l_t \exp(-\nu_0 t)$. The factor $l_t$ in lieu of $l_0$ shows the difference
between Greenwood and Yule's earlier approach and our present case.

Similar reasoning can be applied to vessels which have already had accidents, and Greenwood
and Yule give in formulae (32a) to (32f) the basis for computing the expected frequencies of
vessels with 0, 1 . . . accidents after time T, always assuming that no other decremental force
exists, and moreover supposing that the forces which lead to casualties of different orders are
different. They then depart from the latter assumption and we are here particularly concerned
with formula (33), which distinguishes only between a rate $\nu_0$ for the first casualty, and another
rate, $\nu_1$, for all other casualties. Our own material is in any case too scanty for further
differentiation. Writing $\delta = (\nu_1 - \nu_0)t$, their formula (33) gives the following stratification after time $t$:


Table 10.

Number of casualties suffered. [table entries not recoverable]

N.B. $f(\delta)$ = [not recoverable]


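We can now take stock and see which observed values can be compared with theoretical values,
in order to test the hypothesis that $\nu_{0t}$ is constant (but not, of course, $\mu_t$). We have in Table I
the observed values of

(i) $\int l_t\, dt$

and in Table II the values of

(ii) $\int l_t^0\, dt = \int l_t \exp\big(-\int_0^t \nu_{0t}\, dt\big)\, dt$.

Moreover, we have the casualties up to time $t$

(iii) $C_t = \int_0^t l_t \exp\big(-\int_0^t \nu_{0t}\, dt\big)\, \nu_{0t}\, dt = \int_0^t l_t^0\, \nu_{0t}\, dt$.

Either part of (iii) can be used for the test whether $\nu_{0t}$ may be replaced by a constant $\nu_0$. Writing
$L_{i,i+1}$ for the days exposed between ages $t_i$ and $t_{i+1}$, we can either consider

$$C_w = \nu_0 \int_0^w l_t \exp(-\nu_0 t)\, dt = \nu_0 \sum_i L_{i,i+1} \exp\Big(-\nu_0\, \frac{t_i + t_{i+1}}{2}\Big) \quad \text{(approx.)}$$

(where $w$ is the highest age attained) and find $\nu_0$ so that this equation is satisfied; then

$$L_{i,i+1}^0 = L_{i,i+1} \exp\Big(-\nu_0\, \frac{t_i + t_{i+1}}{2}\Big) \quad \text{(approx.)}$$

can be compared with the observed numbers of days exposed to first casualties. Alternatively,
and this is simpler, we can take $C_w = \nu_0 \int_0^w l_t^0\, dt$, calculate $\nu_0$ from this and compare the observed
values of casualties between $t_i$ and $t_{i+1}$ with

$$\nu_0\, L_{i,i+1} \exp\Big(-\nu_0\, \frac{t_i + t_{i+1}}{2}\Big).$$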

In this case there is, of course, no reason why the total of all expected casualties should equal
that of the casualties actually observed. The latter procedure was adopted for first casualties
and similarly for later casualties, taking as values of $\nu_0$ and $\nu_1$ $5.7331 \times 10^{-4}$ and $11.4288 \times 10^{-4}$
respectively. Table VI gives the calculated days exposed up to the times of the successive
casualties, whereas VII shows the accidents which would have arisen under the hypothesis of
constant $\nu$'s, together with those which actually did arise. For instance, we have multiplied
120,087, which appears in the first line of Table I, under “Totals,” by $\exp(-25 \times 5.7331 \times 10^{-4})$
= .98577, to obtain 118,378 in Table VI. Further multiplication by $5.7331 \times 10^{-4}$ gives 67.9 in
Table VII. The agreement is bad, which is not surprising, as all the builders were lumped
together. Similar calculations were therefore made for builders D and F separately, and Table
VIII, which gives calculated and observed first and later casualties, shows a much better fit,
although one would still say that it is very far from perfect.
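The worked example for the first age group can be reproduced as follows. This is a minimal sketch; the function name `expected_first_casualties` is illustrative, and the age group 0-49 is represented by its midpoint of 25 days, as in the text.

```python
import math

NU_0 = 5.7331e-4  # rate of first casualties per ship-day, from Section 2


def expected_first_casualties(days_exposed, age_start, age_end, nu=NU_0):
    """Days exposed to a first casualty, and expected first casualties, for one age group."""
    midpoint = (age_start + age_end) / 2
    # Proportion of exposure still free of a first casualty, taken at the midpoint age.
    days_at_risk = days_exposed * math.exp(-nu * midpoint)
    return days_at_risk, nu * days_at_risk


days, casualties = expected_first_casualties(120_087, 0, 50)
print(round(days), round(casualties, 1))
```

The same function, applied line by line to the "Totals" column of Table I, would give the remaining entries of Tables VI and VII for first casualties.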


d. A Generalized Regression Analysis. 

In order to cope with the difficulty of giving due weight to the ages of the vessels under
observation yet another approach may be adopted. This consists of the separation of the total
population into subgroups according to the highest age attained by the individual vessels. For each
subgroup the distribution according to casualties suffered was obtained and the result is given
in Table V. Now if the average rate of accident were $q = \sum_x C_x / \sum_x V_x x$ and if $Q_x$ is the ratio of
casualties to vessels in the sub-group with $V_x$ vessels whose highest age attained was $x$, then
we should expect $Q_x$ to be approximately $qx$. Similarly, $\sum_x V_x (Q_x - qx)^2$ is a measure of the
goodness of fit.

At a first glance this looks like a linear regression analysis, but it will be found on closer
investigation that this impression is mistaken. Let us try to fix the number of degrees of freedom.
If $q$ were taken from theory without recourse to the data, then the number of degrees of freedom
would be 22. On the other hand, if $q$ were found by the method of least squares, 21 would be the
appropriate number. But the least square value of $q$ is $\sum_x C_x x / \sum_x V_x x^2$, which is different from
$\sum_x C_x / \sum_x V_x x$ given above. Thus the testing of the fit by the chi-squared method is only an
approximation. It gives 105, which is large enough to reject the hypothesis without scruples, and this
is in accordance with our earlier results. However, it is more interesting to investigate the exact
sampling distribution of the expression

$$\sum_x V_x \Big[ Q_x - x\, \frac{\sum_x C_x}{\sum_x V_x x} \Big]^2 .$$

To obtain a slightly more general result we consider the expression $\sum_i (y_i - c\,a_i)^2$, where the $y_i$
are sample values from a normal population with variance 1. We then introduce $c = \sum_k b_k y_k$,
so that the quadratic expression becomes $\sum_i \big(y_i - a_i \sum_k b_k y_k\big)^2$.

Now it is known from Cochran's paper (7) that this expression is distributed as the linear form
$\sum_i \lambda_i z_i$, where the $z_i$ vary independently as a $\chi^2$ distribution with 1 degree of freedom and the
$\lambda_i$ are the non-zero latent roots of the matrix of the quadratic form. If $m$ of the latent roots are
unity and all the others are zero, then we obtain a $\chi^2$ distribution with $m$ degrees of freedom.
Now the matrix of the form is in our case equal to $MM'$, where

$$M = (\delta_{ik} - b_i a_k),$$

so that $M$ is only singular when $\sum_k a_k b_k = 1$. If the $b_i$ are taken from the least square solution,
then this is the case, and it is known that a $\chi^2$ distribution is obtained. In the case with which
we are concerned, $y_i$ corresponds to $Q_x \sqrt{V_x}$, $c$ to $q$ and $a_i$ to $x \sqrt{V_x}$, hence $b_i$ corresponds
to $\sqrt{V_x} / \sum_x V_x x$. Again $\sum_i a_i b_i = 1$, but the remaining squares do not, in general, have
the same variance, and the $\chi^2$ distribution does not apply.

A more detailed investigation of the latent roots of the matrix $MM'$ and generalizations to
include the fitting of more than one constant must be left for a later occasion.
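The contrast between the two choices of the $b_i$ can be made concrete with a small numerical sketch. It works with the residual operator $R = I - ab^T$, for which $Ry$ has components $y_i - a_i \sum_k b_k y_k$, so that the quadratic form is $y^T R^T R\, y$. The sub-group sizes `V` and ages `X` are hypothetical. With the least-squares $b$, $R$ is symmetric and idempotent (an orthogonal projection, whose latent roots are 0 or 1); with the pooled-rate $b$ it is still idempotent but no longer symmetric, and $R^T R$ is not idempotent.

```python
import math


def residual_operator(a, b):
    """R = I - a b^T, so that (R y)_i = y_i - a_i * sum_k b_k y_k."""
    n = len(a)
    return [[(1.0 if i == k else 0.0) - a[i] * b[k] for k in range(n)]
            for i in range(n)]


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][j] * q[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def close(p, q, tol=1e-9):
    return all(abs(x - y) <= tol for rp, rq in zip(p, q) for x, y in zip(rp, rq))


# Hypothetical sub-groups: highest age attained x and number of vessels V_x.
X = [1.0, 2.0, 3.0, 4.0]
V = [40.0, 30.0, 20.0, 10.0]

a = [x * math.sqrt(v) for x, v in zip(X, V)]

# b from the pooled rate q = sum C_x / sum V_x x, as used in the text.
total_vx = sum(v * x for v, x in zip(V, X))
b_pooled = [math.sqrt(v) / total_vx for v in V]

# b from the least-squares solution c = sum a_i y_i / sum a_i^2.
sum_a2 = sum(ai * ai for ai in a)
b_least_squares = [ai / sum_a2 for ai in a]

R_ls = residual_operator(a, b_least_squares)
R_pooled = residual_operator(a, b_pooled)
G_pooled = matmul(transpose(R_pooled), R_pooled)  # matrix of the quadratic form

print("least squares: symmetric", close(R_ls, transpose(R_ls)),
      "idempotent", close(matmul(R_ls, R_ls), R_ls))
print("pooled rate:   symmetric", close(R_pooled, transpose(R_pooled)),
      "idempotent", close(matmul(R_pooled, R_pooled), R_pooled))
print("form matrix idempotent for pooled rate:",
      close(matmul(G_pooled, G_pooled), G_pooled))
```

Since the form matrix in the pooled-rate case is not idempotent, its non-zero latent roots cannot all be unity, which is the reason the $\chi^2$ distribution does not apply exactly.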

A glance at this paper will show that it was not written at a first attempt. During its growth 
I have had the benefit of discussions with many friends and colleagues, some of whom were 
technical experts and some who were competent in statistical theory. 

References. 

1 Greenwood, M., and Yule, G. U. (1920), “An Enquiry into the Nature of Frequency-distributions
Representative of Multiple Happenings,” J.R.S.S., 83, 255.

2 Seal, H. L. (1945), “The Mathematics of a Population Composed of k Stationary Strata,” Biometrika,
33, 226.

3 Vajda, S. (1945), “Introduction to a Mathematical Theory of the Graded Stationary Population,” SRE.
Dept. Admiralty.

4 Vajda, S. (1944), “The Algebraic Analysis of Contingency Tables,” J.R.S.S., 106, 333.

5 Yates, F. (1934), “The Analysis of Multiple Classifications with Unequal Numbers in Different
Subgroups,” J. Amer. St. Ass.

6 Fisher, R. A., and Yates, F. (1947), Statistical Tables for Biological, Agricultural and Medical Research.
3rd ed.

7 Cochran, W. G. (1934), “The Distribution of Quadratic Forms in a Normal System,” Proc. Camb. Phil.
Soc., 30, 178.

8 Cox, D. R., “The Derivation of Significance Tests for the Differences Between Accident Rates,” RAE
Report No. SME 3367. Royal Aircraft Establishment, Farnborough.

Appendix. 

The Work Done on the Hollerith Installation. 

1. Collection of Material and Creation of the Card Files.

Schedules were completed by the collecting section giving basic data for each vessel included in
the investigation. (Specimen 1 attached.)* The schedule headings 1-25 refer to the columns of
an 80-column Hollerith card on to which the information was to be punched. The schedule
contains also entries which were not intended to be recorded by punching and therefore were not
headed by a column number. One of them is the name of the vessel, which was often subsequently
altered, making it unsuitable for purposes of identification. An appropriately defined “registered
number” was used instead. Column 1 was left free altogether; the entry in column 20 was always
“0.”

One card was punched for every line of the schedule and thus the History Card (H.C.) file was 
created. This file was always kept in strict Registered Number order. The cards were later to 
receive the details of each casualty suffered, as the relevant reports came in. 

A specimen of a H.C. is attached. A detailed explanation of the columns is given in Table A.*
The punching of the data concerning casualties and exits on to the H.C.'s was not done directly
from schedules, but by automatic reproduction from Change Cards, which were either Casualty
Cards (C.C.) or Exit Cards (E.C.), and which will be described presently. However, it is necessary
to point out at this stage that as casualties were recorded on the H.C., their number was punched
into column 20. Thus this column, which contains only “0” as long as no casualty was reported,
accumulates further entries and gives, by its last punching, the number of casualties suffered by the
vessel up to date.

It should perhaps be mentioned that the data were collected retrospectively, but in principle 
it would have been possible to receive them as vessels were completed. 

As the Reports of Casualties came in casualty schedules were prepared. (Specimen 2 attached.) 
They contain the name of the vessel and cover columns 1-5 (I remaining blank) for identification, 
20, and 26-37 for the description of the casualty. Column 20 and the last four columns (34-37) 
were, however, left blank by the collecting section. The Duration of Spell of Service is, in the 
case of the first casualty, the difference in days between the Date of Completion and the Date 
• Tables A, B, C, D and E and Specimens 1 and 2, are given at the end of this Appendix. 



152 


VA3DA— Statistical Investigation of Casualties 


[No. 2, 


of the Casualty. In the case of later casualties the spell of service is the difference between the 
previous casualty and the casualty then considered. Hence, before making any entry in these 
columns, the number of the casualty just reported had to be ascertained. This was done by the 





CASUALTY CARO 



Hollerith section. The H.C. card with the same Registered Number was located and the last 
number punched on column 20 as well as the date of the corresponding casualty was noted (but 
without taking the card out of its file). The next higher number was then entered on the schedule 
in column 20 against the casualty reported and the spell of service computed and entered in columns 
34-37. A Casualty Card (C.C.) was punched for every line of the Casualty Schedule and thus the 

















1947] Suffered by Certain Types of Vessels 153 

C.C. file was created. It was to serve two purposes : firstly, to build up the H.C. file with casualty 
data, and secondly, to provide the data for statistical investigations. 
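
The numbering of casualties and the computation of the spell of service described above can be sketched in modern terms. The following is a minimal illustration in Python, not part of the original Hollerith procedure; the function name and the dates are hypothetical, and the spell is taken as a plain difference in days.

```python
from datetime import date

def spells_of_service(completion, casualty_dates):
    """Number each casualty and give the days elapsed since the previous event.

    The first spell runs from the Date of Completion; each later spell runs
    from the preceding casualty, as described in the text.
    """
    rows = []
    previous = completion
    for number, when in enumerate(sorted(casualty_dates), start=1):
        rows.append((number, (when - previous).days))
        previous = when
    return rows

# Hypothetical vessel: completed 1 Jan 1943, casualties on 6 Apr and 28 Jul 1943.
example = spells_of_service(date(1943, 1, 1), [date(1943, 4, 6), date(1943, 7, 28)])
print(example)
```

Each pair gives the entry for column 20 and the entry for columns 34-37 of one line of the casualty schedule.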

A specimen of a C.C. is attached, and full details are given in Table B. 

The field consisting of columns 26-37 and headed “Latest Casualty” probably requires clarification. 
In order to simplify casualty investigations, it was necessary to have all casualties punched 
in the same columns irrespective of whether 1st, 2nd or 3rd, etc. Thus, columns 31-37 will always 
be punched, plus a repetition (except for a first casualty) within the correct casualty field. To 
amplify this explanation: If one History Card is punched with casualty data up to column 51 
(i.e. 3rd Casualty), then there will be three Casualty Cards in the Casualty File, each separately 
recording the 1st, 2nd and 3rd casualties in columns 26-37, as well as within their appropriate 
fields. The C.C.’s punched from the schedule do not yet contain entries in columns 6-19 and 21- 
25. These details are to be taken from the corresponding H.C.’s and the latter have therefore 
to be extracted from the H.C. file by the Collator. The next step consisted in the transfers of 
6-19 and 21-25 from the H.C. to the corresponding C.C. in the same columns. This was carried 
out by the Reproducer, which checks, at the same time, that the two files are in identical order, 
so that the transfer is not made on to a wrong card. This completes the punching of the 
C.C.'s. 
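
The transfer with its simultaneous check of file order may be pictured as follows. This is only a sketch under stated assumptions: cards are represented as Python dictionaries, and the field names (`reg_no`, `builder`, `completed`) and card contents are hypothetical stand-ins for the punched columns.

```python
def reproduce(source_cards, target_cards, fields):
    """Copy `fields` from each source card on to the target card in the same position.

    Like the Reproducer, refuse to transfer anything if the two files are
    not in identical order of Registered Number.
    """
    if len(source_cards) != len(target_cards):
        raise ValueError("files differ in length")
    for position, (src, dst) in enumerate(zip(source_cards, target_cards)):
        if src["reg_no"] != dst["reg_no"]:
            raise ValueError(f"files out of step at card {position}")
    return [{**dst, **{name: src[name] for name in fields}}
            for src, dst in zip(source_cards, target_cards)]

# Hypothetical extract of the H.C. file and the matching C.C.'s.
history = [{"reg_no": "0012", "builder": "A", "completed": "01143"},
           {"reg_no": "0345", "builder": "D", "completed": "15243"}]
casualty = [{"reg_no": "0012", "casualty_no": 1},
            {"reg_no": "0345", "casualty_no": 2}]
completed = reproduce(history, casualty, ["builder", "completed"])
```

Checking the whole file before copying anything mirrors the purpose of the machine check: no data are transferred on to a wrong card.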

The following step was to transfer the casualty data from the C.C.’s to the H.C.'s. It was, 
however, necessary to treat 1st Casualties separately from all others; therefore the H.C.’s were 
sorted on column 20 and those containing only a “0” were separated from the remainder. The 
corresponding Casualty Cards were those which had a “1” in column 20, and these were also 
separated. As before, the data were transferred, in this case from C.C. to H.C., by means of the 
Reproducer. By including column 20 in this transfer, the H.C. always contains the number of 
casualties so far reported. 

All that remains is to file the H.C.’s and the C.C.’s back into their respective files by means of 
the Collator. 

Finally, for each exit an Exit Card was punched, giving the Registered Number and the Date. 
The latter was then transferred on to the corresponding H.C. by a process which was analogous 
to (though shorter than) the one just described. The exit cards were marked by a Y in column 1, 
i.e. by a hole in the top position. 


2. The Treatment of the Cards for the Construction of Tables 

The H.C. file was not supposed to be disturbed for any length of time, because it has always 
to be available for reference, and it was therefore decided that copies were to be made if any H.C.’s 
were to be used for the investigation. In the sequel we will only refer to the reproduced H.C.'s. 

The first problem to be dealt with was the determination of the total of days of service during 
which every vessel was exposed to a first casualty. In this connection all vessels had to be classified 
according to the occurrence or not of casualties and simultaneously according to whether or not 
they were in service up to November 30, 1944. 

The four resulting groups had to be dealt with differently. Those with casualties had the 
total of days of service already given in columns 34-37 of their H.C.’s. Those without casualties 
were, of course, blank in these columns. They were now sorted according to the date of delivery 
(columns 21-25), and then listed on the Hollerith Tabulator to produce the Registered Number, 
the day of Delivery and the day of Exit (if any). The days from completion to Exit were then 
worked out for exits occurring without previous casualty and the days from completion to 
November 30, 1944, inclusive for cards still in existence at the termination of the observations. 
The total of the corresponding days was then entered on all cards in columns 34-37. 
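
The rule for the entry in columns 34-37 can be summarised in a short sketch. It is an illustration only: the function and argument names are hypothetical, and counting November 30, 1944, itself as one extra day is an assumption about what "inclusive" meant in practice.

```python
from datetime import date

END_OF_OBSERVATION = date(1944, 11, 30)

def days_exposed_to_first_casualty(completion, first_casualty=None, exit_date=None):
    """Days during which a vessel was exposed to a first casualty."""
    if first_casualty is not None:
        # Vessels with casualties: spell from completion to the first casualty.
        return (first_casualty - completion).days
    if exit_date is not None:
        # Exit without previous casualty: completion to exit.
        return (exit_date - completion).days
    # Still in service: completion to the end of observation, inclusive
    # (the "+ 1" is the assumed reading of "inclusive").
    return (END_OF_OBSERVATION - completion).days + 1

still_in_service = days_exposed_to_first_casualty(date(1944, 11, 1))
left_early = days_exposed_to_first_casualty(date(1944, 11, 1), exit_date=date(1944, 11, 11))
```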

The step now to be described is the construction of a double entry table for Days in Service 
0-49, 50-99, . . . (to be called age groups) and builders A, B . . . G. Each cell of this 
table should contain (1) the number of those vessels which suffered their first casualty during the 
period, (2) the number of those vessels which did not suffer any casualty during their period of 
exposure, and (3) the total of the numbers in columns 34-37 for all cards belonging to the 
cell. 

The cards were sorted and collected into age groups. Within each age group a further sorting 




Table I. 


Age Group. 

BUILDER. 

A 



B 



C 



D 


(1) 

(2) 

(3) 

(1) 

(2) 

(3) 

(1) 

(2) 

(3) 

(1) 

(2) 

(3) 

1 

18,1.04 

12 

6-6 

17,138 

13 

7-6 

17,031 

5 

2*9 

23,557 

1 

0*4 

2 

16,257 

12 

7-4 

15,618 

13 

8 3 

16,887 

17 

10*1 

‘23,370 

3 

1*8 

S 

14,.085 

8 

.0-6 

14,376 

8 

5-6 

16,604 

10 

60 

23,125 

4 

1*7 

4 

13,190 

5 

3-8 

13,267 

10 

7-5 

16,1.01 

1.0 

9*3 

22,446 

5 

2-2 

5 

11,8.01 

17 

14-3 

1 1 ,943 

12 

100 

1.0,827 

10 

6*3 

20,807 

14 

6*7 

» 

10,710 

13 

12 1 

10,8.08 

10 

9*2 

15,306 

10 

6*5 

18,858 

11 

5*8 

7 

0,870 

7 

71 

10,010 

6 

60 

14,4.03 

17 

11*8 

17,049 

7 

41 

H 

8,. 096 

11 

12-8 

9,018 

5 

5-5 

12,864 

11 

8*6 

14,659 

12 

8*2 

0 

7,404 

5 

6-7 

8,33.0 

fi 

60 

11,886 

11 

9*3 

12,373 

3 

2-4 

10 

6,430 

6 

9 3 

7,460 

6 

80 

10,722 

14 

13*1 

10,306 

10 

9*7 

11 

5,654 

6 

10 6 

6,092 

3 

4-9 

9..392 

8 

8*5 

8,452 

9 

10*6 

12 

4,565 

4 

8-7 

4,84.0 

4 

8-2 

8,092 

3 

3*7 

6,875 

1 

1-6 

13 

3,584 


00 

3,707 

3 

8 1 

6,676 

6 

90 

5,535 

3 

6*4 

14 

2,908 

i 

3 4 

2,935 

4 

13-6 

.0.4.39 

2 

3 7 

4,403 

• ■ 

0-0 

ir. 

2,270 

1,666 


0 0 

2,161 


* 0 0 

4,400 


00 

3,372 


0*0 

16 


(I 0 

1,512 


0 0 

3,247 


0*0 

2,205 


0*0 

17 

1,010 


00 

1,063 


00 

2,412 

3 

8-3 

1,121 


0*0 

18 

428 


00 

578 

i 

17 3 

1 .706 


0 0 

522 


0*0 

19 

30 


0‘0 

138 

1 

72 5 

617 


00 

230 


0*0 

20 







178 


0*0 

89 


0-0 

21 







18 


0 0 

11 


00 

22 













Totals 

139,168 

107 

7-7 

141,054 

104 

7*4 

189,903 

141 

7-4 

2 19, .365 

83 

3*8 

1 

17,9.00 

11 

6- 1 

16,910 

13 


16,83.0 

5 

3*0 

23,531 

Table II.— 

1 ' 0*4 

2 

1.0,404 

10 

6 5 

14,593 

11 

7-5 

16,118 

17 

10 5 

23 230 

3 

1*3 

a 

13,420 

8 

6-0 

13,110 

.0 

3 8 

1.0,4.36 

7 

4 • .0 

22,896 

3 

1-3 

4 

1 1,743 

3 

2 6 

J 1 .626 

9 

1 • 4 

14, .037 

10 

6 9 

22,122 

4 

1*8 

r> 

9,959 

1.0 

15 1 

10,040 

7 

7-0 

13,779 

8 

.0 8 

19,988 

12 

6*0 

6 

8,410 

8 

9-5 

8,596 

7 

81 

12,926 

6 

4 0 

17,749 

9 

5*1 

7 

7,560 

3 

4*0 

7,600 

6,.074 

.0 

6'6 

11,713 

11 

9*4 

15,537 

5 

3*2 

8 

6,219 

9 

14 5 

2 

30 

9,723 

6 

6*2 

12,982 

6 

4*6 

0 

4,804 

3 

6-2 

5,977 

2 

3 3 

8, .065 

6 

7*0 

10,548 

1 

0*9 

10 

3,886 

3 

7 7 

5,174 

2 

3-9 

7,357 

8 

10*9 

8.307 

6 

7*2 

11 

3,255 

3 

9 2 

4,103 

2 

4 9 

6,3 12 

6 

9-.0 

, 6,336 

5 

7*9 

12 

2,678 

2 

7'5 

3,110 

i 1 

3*2 1 

5, ‘258 

2 

3*8 

:• 5,011 

1 

2 0 

13 

1 ,956 


0 0 

2,111 

! 1 

4-7 ' 

4,242 

i •''* 

11*8 

1 3,9.52 

3 

7*0 

14 

1 ,562 

’i 

6-4 

1 ,395 

1 

7 2 , 

3,318 

1 

3*0 

i 3,208 


0*0 

ir> 

1,240 


00 

1 .0.01 


00 1 

2.738 


0*0 

1 2,.5-24 


0*0 

16 

985 


00 

731 


00 ; 

3,113 

1 ’ ’ ! 

0*0 

i 1,7‘20 


0*0 

17 

612 


o-o 

528 


i 

- 1,585 

1 *2 

12-6 

924 


0*0 

18 

341 


00 

275 1 


0 0 I 

1,062 

1 

; 00 

453 


0*0 

19 

30 


0 0 

71 1 

’i 

140 8 ! 

310 

1 . . ' 

! 0*0 

1 180 


0*0 

20 







67 

1 

1 0-0 

87 


0*0 

21 

22 

i ! 

1 ! ! 




i 

13 

i-_ J 

1 00 

; 11 

1 


0*0 

Totals 

112,044 

79 

7-1 

113„07.0 

00 

61 

' 154,067 

1 100 

1 OTi 

1 201,302 

59 

2*9 

1 1 

, 204 

1 

49-0 

228 


0*0 

196 


00 

26 

Table III.— 

.. 1 0*0 

2 

«.03 1 

2 

23*4 

1.025 

2 ! 

19-5 

769 

i ■ ■ ' 

0*0 

134 


0*0 

3 

1 1,16.0 1 


! 0-0 

1,266 

.3 

23 7 1 

1,10H 1 

1 *3 ' 

25 7 

2‘29 

i 

43*7 

4 

; 1,4.03 I 

i - 

1 13*8 

1.641 

1 

61 j 

1,614 

1 ^ 

31*0 

324 

1 

30*9 

5 

1 ^892 

1 2 1 

1 10-6 

14)03 

5 

26*3 ' 

2,048 

1 2 

i 9*8 

819 

9. 

24*4 

0 

2,:}oo 

1 5 

1 21-7 

2,262 

3 

13 3 : 

2,389 

! ^ 

i 16 8 

1,109 

2 

18*0 

7 

2,:no 

I 4 

1 17-3 

•2,410 

1 

4*1 1 

•3,740 

6 

21*9 

1,512 

9 

13*2 

8 

2,.i 1 1 

' 2 

! 

2,4 14 

3 

12*3 1 

3.141 

' 5 

15 9 

1,677 

0 

35*8 

9 

2,600 

1 2 

7 ■ 7 

2,3.08 

3 

12*7 , 

3,321 ; 

.0 

15*0 

1,825 

2 

10*9 

10 

2..044 


i 11-8 

‘2,286 

4 

17 5 

3,36.0 

1 6 

17*8 

1,099 

4 

200 

U 

2,399 


1 12'5 

1,989 

1 

5 0 

3,050 

; 2 

0*5 

2,116 

4 

18*9 

12 

1,887 

2 

l()-6 

1 ,735 

3 

17*3 

2.M34 

1 1 

3*5 

1,804 


0*0 

13 

1,628 

i •• 

; 00 

1,.096 

.) 

12*5 

1 3,434 , 

1 1 

4*1 

1,583 ' 


0*0 

14 

1,346 


00 

1,.040 

3 

19*5 

2,121 

: 1 

4*7 

1,195 ! 


0*0 

ir> 

1 1 .0.30 


1 00 

1,110 


0*0 

1,662 


0*0 

848 


0*0 

10 

681 


1 0 0 

781 


0*0 

1,104 


0 0 

485 1 


0*0 

17 

368 


: 00 

53.0 


0*0 

827 

1 

0*0 

197 I 


0*0 

18 

i 87 

1 ' ] 

' 0 • 0 

303 

i 

33*0 

644 


0*0 

69 1 


0*0 

19 




07 


U 0 

307 


00 

50 1 


00 

20 







111 


00 

2 1 


00 

21 

22 



1 

1 


.. 

!! 


1 !: 




Totals 

27,124 

i 

1 10-3 

27,479 

35 

12*7 

35,836 

! 

11-4 

18,063 1 

24 

13*3 





— 











(1) = Days exposed. (2) = Casualties. 





All Casualties. 


Age Group.     E     F     G     Totals. 

(1) 

(2) 

(3) 

(1) 

(2) 

(») 

(1) 

(2) 1 

(3) 

(1) 

(2) 

(8) 

1 

: 16,51.") 

13 

7-9 

17,892 

9 

50 

0,800 

24 

24-5 

120,087 

77 

0‘4 

2 

16,369 

16 

0-8 

17,300 

r> 

2 9 

8,997 

6 

6 7 

114.807 

72 

6-3 

3 

1 10,125 

1.5 

9-3 

16,515 

3 

1-8 

8,207 

4 

4-9 1109,537 

52 

4*7 

4 

15,788 

21 

13*3 

15,504 

3 

1-9 

7, .585 

4 

5 3 1 103.937 

63 

61 

5 

. 15,591 

15 

00 

14,4.35 

0 

4-2 

0,710 

6 

8-0 

97,164 

80 

8-2 

6 

15,351 

20 

13-0 

13,224 

1 

0-8 

.5,718 

4 

7-0 

90;025 

69 

7-7 

7 

1 14,848 

18 

12 1 

12,171 

3 

2-5 

5,134 

0 

11-7 

83;. 535 

04 

7-7 

8 

' 13,507 

17 

12-5 

10,418 

2 

1 9 

4,328 


11-5 

73,4.50 

63 

8-6 

1) 

12,250 

12 

9-8 

8,869 


00 

3,070 

2 

5-4 

64,787 

38 

5'9 

10 

10, .588 

13 

12 3 

7,457 

3 

40 

3,061 


0-0 

.56,024 

52 

9-3 

11 

I 9,222 

9 

9-8 

0,180 

2 

1 3-2 

I 2,480 

2 

8-1 

47,478 

30 

8-2 

12 

, 8,010 

1.3 

10-2 

5,195 

2 

1 3-8 

' 1,790 


00 

39,372 

27 

G-9 

13 

1 0,560 

7 

10-7 

4,064 

2 

4 9 

i 1,100 


00 

31,241 

21 

0-7 

U 

5,246 

1 

1 9 

3,318 

4 

1 12- 0 

060 

'i 

15-2 

24,909 

13 

5-2 

IT) 

4,140 

4 

90 

3,000 

1 

1 3 3 

' 3tM) 


0 0 

19,715 

.5 

2-5 

16 

3.170 

1 

3 2 

2,758 

2 

7*3 

i 204 


00 

14,702 

3 

' 20 

17 

2, .302 

2 

8 -.5 

2,0.50 


1 00 

, 95 

■ i 1 

0 0 

10,113 

4 

! 40 

18 

1,69.5 


0 0 

• l,.320 


00 

49 


00 

6,298 

1 

1-6 

10 

071 


0 0 

578 


1 00 


1 • • 


2,-564 

1 

1 3 9 

20 

31.5 


0-0 

255 


: 00 


I 


837 


00 

21 

10 


0 0 

140 


! 00 




1 180 


' 0-0 

22 




10 


! 00 




' 16 


; 0-0 

Totals 

188,708 

107 

10 4 

102,080 

49 

1 30 

; 69,954 

04 

9 1 

1,110,838 

; 744 

P 6-7 


First Casualties. 


1 

' 16,1.55 

13 

8-0 

i 17,60.5 

9 

51 

9,097 

21 

23-1 

! 11.8,1 13 

73 

6-2 

2 

15, .378 

13 

8 5 

, 10,705 

.5 

30 , 

7,882 

ry 

6-3 

1 109,316 

04 

5-9 

3 

14,67.5 

12 

i H-2 

! 15,745 

3 

1-9 ' 

6,072 

2 

2-9 

1 102,254 

40 

3'9 

4 

1 13,483 

10 

141 

i 14,692 

2 

1-4 

6,241 

4 

6-4 

94,444 

51 

5*4 

,5 

12,573 

10 

8-0 

' 13,3.58 

G 

4 5 

5,21 1 

5 

9-6 

' 81,908 

63 

7*4 

0 

11,720 

15 

12 8 

: 11,993 

1 

0-8 

4.242 

4 

0>4 

1 75,636 

.50 

60 

7 

10,823 

9 

8 .3 

10,903 

2 

1 -8 

3,636 

4 

11-0 

! 07,772 

.39 

5-8 

8 

9,316 

10 

10-7 

9.327 

2 

2 1 

2,933 

4 

13 ;6 

1 .57,074 

39 

6-8 

9 

7,916 

4 

.5-1 

' 7,888 


00 

2,256 

2 

8-9 

17,954 

18 

3*8 

10 

6,331 

7 

11 1 

6,494 

3 

4-6 , 

1,815 


00 

j 39,364 

20 

7-4 

11 

.5,163 

4 

7 7 

5,293 

2 

3-8 , 

1 ,.190 


0 0 

31,882 

22 

60 

12 

1,267 

5 

, 117 

4,365 

.. 

0 0 ' 

935 


0-0 

25,021 

11 

4-3 

13 

3.293 

2 

6 1 

1 3,311 

2 

60 ' 

578 


O'O 

19,473 

13 

0*7 

14 

' 2,618 


0 0 

, 2.537 

3 

11-8 

286 


0-0 

i 14,924 

0 

4-0 

15 

2,111 

*3 

ll-(» 

' 2,25 1 

1 

4 4 

106 


0-0 

1 12,111 

4 

3-3 

10 

1 ,696 

1 

.5 9 

i 1 ,988 

•> 

101 

127 

1 

0 0 

1 9,390 

3 

3*2 

17 

J,2II 

1 

8 3 

1,410 


0 0 

45 

1 

1 00 

' 0,345 i 

3 

4 7 

18 

910 1 


0-0 

; 92.5 


00 




.3,996 1 


0-0 

19 

198 


0 0 

340 


0 0 




' 1 .429 , 

1 ‘i 

7-0 

20 

172 


0-0 

191 


0 0 




517 1 


O'O 

21 

10 


0 0 

110 


00 




114 


0 0 

22 




16 


0 0 




i 


00 

Totals 

1 110.379 

128 

91 

147,5.37 


2-9 

53,812 

51 

9-5 

1 922,716 

529 

5*7 


Further Casualties. * 


1 

300 


00 

227 

• . 

0-0 

703 

3 

42-7 

1,044 

4 

20-6 

2 

991 

'3 

30-3 

604 

.. 

00 

1,115 

1 

9() 

5,491 

8 

11-6 

3 

1,4.50 

3 

20 7 

770 


0 0 

1,235 

2 

16-2 

7,283 

12 

16-5 

4 

2.305 


8-7 

812 

i 

12-3 

1 ,344 ; 


0 0 

9,493 

12 

12-6 

,5 

3,018 

5 

16-6 

1,077 

.. 

00 

1,499 ; 

‘i 

6 7 

12,256 

17 

13-9 

6 

3.631 

5 

13-8 

1,231 


0 0 

1,476 


00 

14,380 

19 

1.3-2 

7 

4,025 

9 

22-4 

1,268 

i 

7-9 

1,498 1 

2 

13-3 

1 5,763 

25 

15-8 

8 

4,251 

7 

JO- 5 

1,001 


(»0 

J ,395 1 

1 

7 2 

10,376 

24 

14-6 

9 

4,334 

8 

18-4 

981 


00 

1.414 


0-0 

16,833 

20 

11-9 

10 

4,2.57 

6 

14-1 

963 


0 0 

1,246 j 


00 

10,660 

23 

13-8 

11 

4,059 

5 

1 12-3 

893 


0-0 

1,090 1 

2 

18 3 

15.590 

17 

100 

12 

3,743 

8 

, 21-4 

830 

'2 

24 1 

855 1 


0-0 

13,748 

16 

n-6 

13 

3,276 

5 

; 15-3 

723 • 


0-0 

528 1 


00 

i 11,768 

8 

6-8 

14 

2,628 

1 

1 3-8 

i 781 j 

i 

12-8 1 

1 374 ! 

' i 

26-7 

9,985 

7 

• 7*0 

15 

2,005 

; 1 

1 5 0 

i 755 


0-0 

194 


0*0 

7,604 

1 

1 i-3 

16 

1,474 


1 0-0 

770 


00 1 

77 1 


6-0 

5,372 


1 00 

17 

1,151 

! ’i 

i 8*7 

040 


0-0 

I 50 1 


00 

1 3,768 

i 

; 2-6 

18 

' 755 

1 

; 00 

395 


00 i 

1 49 1 


0-0 

1 2,302 

1 

1 4-3 

19 

473 

1 

1 

00 

238 


00 




, 1,135 


; 00 

20 

143 


0*0 

64 


00 , 




320 


00 

21 




36 


00 

i . . i 



aa 


1 O'O 

22 







I 



1 


-l:.P 

Totals 

' 48,329 ' 

60 

1 14-3 

15,149 

1 

6 

3 3 

16,142 ; 

13 

80 

1 188,122 

' 215 

1 11*4 



; 

_ 

1 

1 


1 








(3) = Rate of casualties. 




on column 6 took place. The groups were then separately fed into the Tabulator, and the result 
for the first group (0-49 days of exposure) is here given as an example : 


                     First Casualties 
                  Days of Exposure, 0-49. 

Yard.     Number of       Number of              Total days. 
          casualties.     “No casualties.” 
A              11               45                  1750 
B              13               37                  1510 
C               5                4                   185 
D               1                4                   131 
E              13                2                   355 
F               9                9                   515 
G              21               16                   847 
               73              117                  5293 


In this way all the columns of the table were produced on the Tabulator. The figures were then 
entered by hand into a schedule and the cross totals worked out and checked (Table C). 
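
The contents of each cell of the double entry table can be illustrated by a small sketch. It is not the Tabulator procedure itself; the card tuples are hypothetical, and the age group is taken as the 50-day class numbered from 1 (0-49 days being group 1).

```python
from collections import defaultdict

def tabulate(cards, width=50):
    """Build the cells of the builder-by-age-group table.

    cards: iterable of (builder, days in columns 34-37, had_first_casualty).
    Each cell holds [casualties, vessels without casualty, total days].
    """
    cells = defaultdict(lambda: [0, 0, 0])
    for builder, days, had_casualty in cards:
        cell = cells[(days // width + 1, builder)]
        cell[0 if had_casualty else 1] += 1   # entries (1) and (2) of the cell
        cell[2] += days                        # entry (3): total of columns 34-37
    return dict(cells)

# Hypothetical reproduced H.C.'s.
cards = [("A", 12, True), ("A", 40, False), ("B", 95, True), ("A", 30, True)]
table = tabulate(cards)
print(table[(1, "A")])
```

Summing the cells of one age group over all builders gives the cross totals that were worked out by hand on the schedule.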

A slightly different procedure had to be followed for the analysis of the aggregate of all 
casualties. The cards referring to vessels with casualties show in columns 34-37 the number of 
days up to the first casualty; this period is, however, not the total period of exposure. The reproduced 
H.C.’s already used in the analysis of first casualties could not, therefore, be used now, 
but had to be replaced by others, which show, in columns 34-37, the total of days exposed up to 
November 30, 1944 (incl.) (if there was no earlier exit), or up to exit if such occurred (modified 
H.C.’s). 

To understand the necessity for the altered method of procedure it must be remembered that 
the cards dealt with have been essentially reproduced History Cards, and that all casualties to 
the same vessel are recorded on the same card. But now all casualties are to be counted and 
registered appropriately, and it was therefore necessary to provide new Casualty Cards, which 
differ from the original C.C.’s in that they should all show in columns 34-37, not the total of days 
between the last but one and the last casualty, but the whole period from completion of the vessel 
to the casualty considered. This number is, of course, the total of the numbers appearing in 
columns 34-37, 41-44, 48-51, 55-58, 62-65, 69-72 on the H.C., the addition being carried out up 
to and including the field referring to the casualty for which the new card is being constructed. 
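
The addition of the successive spell fields amounts to forming running totals. A minimal sketch, with hypothetical spell lengths standing in for the entries of columns 34-37, 41-44, 48-51, etc.:

```python
from itertools import accumulate

def ages_at_casualties(spells):
    """Running totals of the spells of service: age of the vessel at each casualty."""
    return list(accumulate(spells))

# Hypothetical vessel with three casualties after spells of 95, 113 and 240 days.
ages = ages_at_casualties([95, 113, 240])
print(ages)
```

Each running total is the entry for columns 34-37 of the corresponding new Casualty Card.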

The preparation of the data for punching into columns 34-37 of the “modified C.C.” was 
carried out by means of the Rolling Total Tabulator. Owing to the limited number of counters 
(namely six), the totals for vessels with six casualties had to be made by hand. These tabulations 
were used as schedules for punching. 

The construction of a table corresponding to Table C was now made in two steps. The first 
step gave, for every combination of builder and age group, the total of days exposed, and the 
second step gave the casualties for the same combinations. This time, if a vessel had suffered two 
casualties, say, after 95 and 208 days, and had its exit after 852 days, then it would be counted as 
one casualty in group 2 (50-99 days) and one in group 5 (200-249 days), and would count for 
852 days exposed in group 18 (850-899 days’ exposure). The final result is given in Table D. 
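
The counting rule of this example can be written out as a short sketch. The function names are hypothetical; the age groups are the 50-day classes numbered from 1, as in the text.

```python
from collections import Counter

def age_group(days, width=50):
    """Group 1 covers 0-49 days, group 2 covers 50-99 days, and so on."""
    return days // width + 1

def add_vessel(casualty_ages, total_days, casualties, exposure):
    """Count each casualty at the age it occurred, and the whole exposure
    in the group of the vessel's total length of observation."""
    for age in casualty_ages:
        casualties[age_group(age)] += 1
    exposure[age_group(total_days)] += total_days

casualties, exposure = Counter(), Counter()
# The vessel of the text: casualties after 95 and 208 days, exit after 852 days.
add_vessel([95, 208], 852, casualties, exposure)
```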

So far we have constructed tables which subdivide the vessels into groups according to their 
total length of observation. This led to Table V. In view of problem (ii), however, we are also 
interested in events happening at certain ages, and for this purpose we have to find out how many 
vessels were observed at ages 0-49 days, 50-99 days, etc. 

Such a table had to be worked out for First Casualties, for All Casualties, and under each of 
these headings for each builder separately. We illustrate its construction for the total of All 
Casualties. Let us consider the row totals of Table D and take, as an example, the fifth line. 
There were altogether 99 vessels, every one of them exposed for a total period of not less than 





200 and not more than 249 days. The total of all days exposed of these vessels was 22,564 days. 
We want to calculate in respect of all vessels the total of all days which were spent at ages from 200 
to 249 days. Hence we must first take away from the 22,564 days mentioned above those 99 x 200 
days which were spent at earlier ages. To the resulting difference we must add the days spent 
at age 200-249 days of all vessels which grew older and are recorded in the lower lines. Therefore 
we add all these vessels (169 + 83 + . . . + 1, i.e. 1888) and multiply their number by 50. This 
gives 94,400. The number of all days at ages 200-249 is therefore finally 22,564 − 19,800 + 
94,400 = 97,164. In this way the whole table was re-shaped. Table E shows the computations 
for “All Casualties.” 
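
The re-shaping can be checked mechanically. The sketch below applies the rule just described to the numbers of vessels and days exposed given in columns (1) and (3) of Table E; the function name is hypothetical, and the code is an illustration rather than the method of computation actually used.

```python
# Columns (1) and (3) of Table E, age groups 0-49, 50-99, ..., 1050-1099.
vessels = [118, 70, 130, 142, 99, 169, 83, 236, 224, 102, 204,
           200, 93, 128, 128, 56, 109, 92, 35, 22, 6, 1]
days = [3637, 5357, 16087, 25887, 22564, 46325, 26635, 86050, 95587,
        48224, 105978, 115872, 58191, 85659, 93265, 43512, 89513,
        81298, 32614, 21387, 6130, 1066]

def days_at_each_age(vessels, days, width=50):
    """Days spent at each age: column (3) + column (4) - column (5) of Table E."""
    result = []
    for i, (n, total) in enumerate(zip(vessels, days)):
        spent_earlier = n * i * width             # days of this group spent at lower ages
        older_vessels = sum(vessels[i + 1:])      # vessels recorded in the lower lines
        result.append(total - spent_earlier + older_vessels * width)
    return result

at_age = days_at_each_age(vessels, days)
print(at_age[4])   # fifth line, ages 200-249
```

The re-shaping only redistributes the days among the age groups, so the grand total of days exposed is unchanged.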

Similar computations were made for each of the builders, and also in respect of “First Casual- 
ties.” The results are incorporated in Tables I and II. Finally, to obtain the similar Table III 
for “Later Casualties,” the numbers in II were subtracted from those in I and the ratios 
recalculated. 


Table IV. 


Frequency Distribution of Intervals 


(1). 

(2). 

(1). 

^2). 

0). 

' 

(2). 

0-9 

24 

300-309 

'■ 15 

600-609 

5 

10-19 

12 

310-319 

9 

610-619 

1 

20-29 

31 

320-329 

i 6 

620-629 

3 

30-59 

35 

330-339 

, 11 

630-639 

3 

40^9 

23 

340-349 

7 

640-649 

2 

50-59 

29 

350-359 

’ 13 

650-659 

4 

60-69 

32 1 

360-369 

! 8 

660-669 

0 

70-79 

21 i 

370-379 

9 

670-679 

1 

80-89 

20 1 

380-389 

5 

680-689 

1 

90-99 

19 1 

390-399 

i 12 

690-699 

0 


i 



700-709 

2 

100-109 

16 

400-409 

2 

710-719 

2 

110-119 


410-419 

i 3 

720-749 

0 

120-129 

14 

420-429 

I . 6 

750-759 

2 

130-139 

17 

430-439 

' 7 ■ 

760-779 

0 

140-149 

18 ! 

440-449 

i 5 

780-789 

1 

150-159 

11 

450-459 

8 

790-799 

0 

160-169 

12 1 

460-469 

8 

800-809 

1 

170-179 

15 1 

470-479 

! b 

810-819 

1 

180-189 

10 , 

480-489 

i 7 

820-829 

0 

190-199 


490^99 

1 3 

830-839 1 

1 

200-209 

20 1 

500-509 

' 7 

840-909 

0 

210-219 

20 1 

510-519 

4 

910-919 

1 

220-229 

10 I 

1 520-529 

' 3 

i 

— 

230-239 

17 

530-539 

! 6 

Total j 

744 

240-249 

10 

540-549 

3 



250-259 

17 

550-559 

! 1 

(1) Intervals in days. 


260-269 1 

i 14 

560-569 

! 1 

(2) Frequency with 

which 

270-279 

14 

570-579 

1 5 

the intervals, specified 

280-289 

1 8 

580-589 

i 1 

in col. (1), preceded an 

290-299 

1 12 

590-599 

i ' 

accident. 






                                        Table V. 

Total exposure   Vessels having had the following number       Total of      Number of   10,000 × Ratio 
of vessel.       of casualties within this time.               casualties.   vessels.    casualties/vessels. 
Days.              0.     1.     2.     3.     4.     5.     6. 
0-49              117      1     ..     ..     ..     ..     ..        1         118          84.75 
50-99              67      3     ..     ..     ..     ..     ..        3          70         428.57 
100-149           126      3      1     ..     ..     ..     ..        5         130         384.62 
150-199           137      5     ..     ..     ..     ..     ..        5         142         352.11 
200-249            88     10      1     ..     ..     ..     ..       12          99        1212.12 
250-299           154     12      3     ..     ..     ..     ..       18         169        1065.09 
300-349            69     13      1     ..     ..     ..     ..       15          83        1807.23 
350-399           208     22      5     ..      1     ..     ..       36         236        1525.42 
400-449           195     20      8      1     ..     ..     ..       39         224        1741.07 
450-499            83     15      4     ..     ..     ..     ..       23         102        2254.90 
500-549           140     48     13      2     ..      1     ..       85         204        4166.67 
550-599           136     45     13      2      3      1     ..       94         200        4700.00 
600-649            64     17     10      2     ..     ..     ..       43          93        4623.66 
650-699            75     36     10      3      3      1     ..       82         128        6406.25 
700-749            59     42     17      7      2     ..      1      111         128        8671.88 
750-799            37     11      6      1      1     ..     ..       30          56        5357.14 
800-849            63     33      8      5     ..     ..     ..       64         109        5871.56 
850-899            64     17     10     ..      1     ..     ..       41          92        4456.52 
900-949            20      9      5      1     ..     ..     ..       22          35        6285.71 
950-999            10     10      2     ..     ..     ..     ..       14          22        6363.64 
1000-1049           5      1     ..     ..     ..     ..     ..        1           6        1666.67 
1050-               1     ..     ..     ..     ..     ..     ..       ..           1             .. 
Totals           1918    373    117     24     11      3      1      744        2447             .. 






                                        Table VI. 
                     Assumption: Accident rate independent of age. 

                             Calculated days exposed to 
Ages in          1st.      2nd.      3rd.     4th.    5th.    6th.   Later      Totals. 
days.                                 Casualties. 
0-49           118,378     1,685       24       0      ..      ..      ..       120,087 
50-99          109,975     4,629      197       6       0      ..      ..       114,807 
100-149        101,962     7,053      497      24       1      ..      ..       109,537 
150-199         94,016     8,978      882      58       3      ..      ..       103,937 
200-249         85,405    10,340    1,301     111       7       0      ..        97,164 
250-299         76,893    11,222    1,718     177      14       1      ..        90,025 
300-349         69,335    11,793    2,123     259      23       2      ..        83,535 
350-399         59,241    11,468    2,370     333      35       3       0        73,450 
400-449         50,776    10,989    2,561     406      49       5       1        64,787 
450-499         42,668    10,180    2,639     467      62       7       1        56,024 
500-549         35,138     9,141    2,606     508      75       9       1        47,478 
550-599         28,315     7,960    2,473     527      85      11       1        39,372 
600-649         21,833     6,582    2,212     511      89      12       2        31,241 
650-699         16,916     5,435    1,962     488      92      14       2        24,909 
700-749         13,012     4,430    1,709     455      92      15       2        19,715 
750-799          9,467     3,400    1,395     397      86      15       2        14,762 
800-849          6,302     2,378    1,034     312      72      13       2        10,113 
850-899          3,813     1,507      691     220      54      11       2         6,298 
900-949          1,509       622      300     101      26       5       1         2,564 
950-999            479       205      104      37      10       2       0           837 
1000-1049           99        45       24       9       2       1       0           180 
1050-                9         4        2       1       0       0       0            16 
Totals         945,541   130,046   28,824   5,407     877     126      17     1,110,838 





Table VII. 
All Builders. 


Ages. 1 


Expected cae 

ualtics. 

Observed casualties. 

Days. 

! ist. 

2!id. 

3rd. 

4th. 

5th. 

6th. 

Totals. 

1st. 

2iul. 

3rd. 

4th. j 5th. I 6tii. Totals. 

0-49 

07 -9 

1-93 

0-03 

0-00 

0 00 

0-00 

69*9 

73 ”* 

4 

“T7~ 


. ! 77 

50-99 

, 630 

5-29 

0-23 

O-Oi 

0-00 

0-00 

68-5 

64 

7 

1 

1 1 

72 

00^ 149 

i 58*5 

8*06 

0-57 

0-03 

0-00 

0-00 

67-2 

40 

10 

1 

’i ; :: ! 

. i 52 

50-199 

: 53 9 

10-20 

1-01 

0 07 

0-00 

0-00 

65-2 

51 

8 

4 


63 

;00-249 

, 490 

11-82 

1-49 

0-13 

0 01 

0-00 

62-5 

63 

15 

2 

:: j :: i 

. i 80 

:50 299 

44-1 

12-83 

1-96 

0-20 

0-02 

0-00 

.59-1 

50 

13 

K 

1 ; . . 

. i 69 

100-840 

: 39 7 

13-48 

2-43 

0-30 

0-03 

0-00 

55 - 9 

39 

23 

1 

] , . . i 

. , 64 

t.50-399 

, 34 0 

13 11 

2 71 

0-38 

0-04 

0-00 

50 2 

39 

18 

3 

3 { .. : 

63 

tOO-449 

1 29- 1 

12 - 56 

2 93 

0-40 

0-00 

0 -»)l 

45- 1 

18 

11 

5 

31 ; 

. 1 38 

t50-499 

24 • 5 

11-63 

3 02 

0-53 

0-07 

O-Ol 

39-8 

29 

15 

5 

1 2 

. ; 52 

>00-549 

' 2J)-1 

10-45 

2-98 

0.58 

0-09 

0-01 

34-2 

22 

10 

4 

2 ' T 

. 1 39 

>50-599 

16*2 

9-10 

2 83 

0 60 

0- 10 

O-Ol 

28 8 

11 

9 

3 

3 i . . ; 

1 1 27 

JOO-649 

1 12-5 

7-52 

2-53 

0-58 

0-10 

0 01 

23*2 

13 

6 

*> 


21 

150-099 

9-7 

0-21 

2-24 

0 - .56 

0 U 

0-02 

18-8 

6 i 

4 


. . 1 .1 

. 1 10 

rOO-749 

7*5 

5 00 

1-95 

0-52 

Oil 

, 0-02 

15 -2 

4 1 

h 1 

‘3 

1 _ ! 

8 

r50 -799 

5-4 

3 - 89 

1 • 59 

, 0-45 

0-10 

: 0-02 

11*5 




1 . . ! . . 1 

. ' 3 

BOO-849 

3-6 

2-72 

1-18 

i 0-30 1 

0-08 

1 0-01 

8 0 


! 1 



4 

sr)0^ 899 

' 2 • 2 

1 72 

0-79 

0 25 

0-06 

i 0-01 

5-0 


1 


1 ‘ * t 

. , 1 

900-949 

ij-o 

0 71 

0-34 

0-12 

0-03 

1 0-01 

2*1 




j •• ! *' 

. 1 1 

950-999 

0-3 

0-23 

0-12 

' 0-04 

0-01 

1 0-00 

7 


i ] ■ 



1 

00-1019 

01 

0 05 

0-03 

0-01 

0 - 00 

; 0-00 

. 2 




! • - i • • I 


50 - 

00 

0-00 

0 - 00 

1 0-00 

0 00 

0-00 

0 




1 .. .. 

i * ' 

.542 2 

118 63 

32-90 

' 0-18 

1-02 

1 0-14 

731-1 

529 

1.56 

39 

! 1 5 i 4 

1 1 744 


                                        Table VIII. 
                     Assumption: Accident Rate Independent of Age. 

                          Builder D.                                  Builder F. 
               First Casualties.    Later Casualties.      First Casualties.    Later Casualties. 
Ages in        Calcu-     Ob-       Calcu-     Ob-         Calcu-     Ob-       Calcu-     Ob- 
days.          lated.     served.   lated.     served.     lated.     served.   lated.     served. 
0-49             6.8        1        0.2        0            5.2        9        0.0        0 
50-99            6.7        3        0.7        0            5.0        5        0.1        0 
100-149          6.5        3        1.1        1            4.7        3        0.2        0 
150-199          6.2        4        1.5        1            4.3        2        0.3        1 
200-249          5.7       12        1.8        2            4.0        6        0.3        0 
250-299          5.1        9        1.9        2            3.6        1        0.3        0 
300-349          4.5        5        2.1        2            3.2        2        0.4        1 
350-399          3.8        6        2.0        6            2.7        2        0.4        0 
400-449          3.2        1        1.9        2            2.3        0        0.3        0 
450-499          2.6        6        1.8        4            1.9        3        0.3        0 
500-549          2.1        5        1.6        4            1.6        2        0.3        0 
550-599          1.7        1        1.4        0            1.3        0        0.3        2 
600-649          1.3        3        1.2        0            1.0        2        0.2        0 
650-699          1.1        0        1.0        0            0.8        3        0.2        1 
700-749          0.8        0        0.9        0            0.7        1        0.2        0 
750-799          0.5        0        0.6        0            0.6        2        0.2        0 
800-849          0.3        0        0.3        0            0.5        0        0.1        0 
850-899          0.1        0        0.2        0            0.3        0        0.1        0 
900-949          0.1        0        0.1        0            0.1        0        0.0        0 
950-999          0.0        0        0.0        0            0.1        0        ..         0 
1000-1049        ..         0        ..         ..           0.0        0        ..         0 
1050-            ..         ..       ..         ..           ..         0        ..         .. 
Totals          59.1       59       22.3       24           43.9       43        4.2        5 





Specimen 1. 

Schedule for Punching. Basic Data for each Vessel. 


Name. 

(3). 

Reg. no. 1 Builder. 

(l).j (2) Construction 

al details. |(3). 

Date of completion. 


T 

2 3 4 5 1 

6 ; 7 ‘ 8 . 9 i 10 

in 

12 13 

14 15 16 1 17 18 19 j20 

21 22 23 24 25 

I 

1 

i 

i 1 

1 : 


' ' 

i 1 


i 

1 

coocoj 

j 

1 j 

1 

i 


(1) Enter code for builder. (2) Enter code for constructional details. (3) No entries to be made here. 


1 

2 

3 

4 

5 


Schedule for Punching. 


Name. 



(1). Registered No. 

! 

1 12 3 4 5 


! 

i 


Specimen 2. 

Casualties. 


(5). Date of casualty. 1 

1 

(2). 

(3). 

1 (**)• 1 

1 

(5). 

Duration of spell | 
of service. 

20 26 27 28 29 30 

31 

32 

33 1 

34 35 36 37 

j 




(1) No entry to be made here. (2) Enter code for “Cause.” (3) Enter code for “Result.” (4) Enter code for “Fault.” 
(5) No entry to be made here by the collecting section. 


                                        Table A. 

              Explanation of All Columns and Headings on History Cards 

Column.   Heading. 

1         Class of Card            This will always be punched “0” as a means of identification. 
2-5       Number                   The Registered No. 
6         Builder 
7-19      Constructional Details 
20        Number of Casualties     Herein is recorded the number of casualties to this particular 
                                   vessel. Where no casualties have occurred, there will 
                                   only be a “0” punched. As further casualties occur, “1,” 
                                   “2,” etc., will be added in the same column. 
21-25     Date of Completion       Day (2), Month (1), Year (2). 
26-30     ..                       These columns are to remain unpunched on the History 
                                   Card. (Compare Casualty Card, Table B.) 
31-37     1st Casualty             Details of the Cause (1), Result (1), Fault (1) and Length of 
38-44     2nd    „                 Spell of Service (4) in days between Date of Completion 
45-51     3rd    „                 and First Casualty or between each Casualty. This is 
52-58     4th    „                 punched from the Casualty Card. 
59-65     5th    „ 
66-72     6th    „ 
73-75     ..                       Unpunched. 
76-80     Date of Exit             Day (2), Month (1), Year (2). 

Note. — Cols. 34-37 had slightly altered meanings on History Cards referred to in the next section, viz.: 
Reproduced H.C.: 

For vessels which suffered a casualty: Meaning as above. 

For vessels without casualty: Meaning altered to “highest age attained, in days.” 

Modified H.C.: for all vessels, meaning altered to “highest age attained, in days.” 





Tables B, C and D (see pp. 162 and 163). 


Table E.
All Casualties.

Columns: (1) Numbers of vessels. (2) Sum of numbers (a). (3) Days exposed.
(4) Days exposed of vessels in higher category (b). (5) Days exposed belonging
to lower category (c). (6) Days exposed at given age, (3) + (4) - (5).

Age.           (1)       (2)         (3)         (4)         (5)         (6)

0-49           118    (2447)       3,637     116,450                 120,087
50-99           70      2329       5,357     112,950       3,500     114,807
100-149        130      2259      16,087     106,450      13,000     109,537
150-199        142      2129      25,887      99,350      21,300     103,937
200-249         99      1987      22,564      94,400      19,800      97,164
250-299        169      1888      46,325      85,950      42,250      90,025
300-349         83      1719      26,635      81,800      24,900      83,535
350-399        236      1636      86,050      70,000      82,600      73,450
400-449        224      1400      95,587      58,800      89,600      64,787
450-499        102      1176      48,224      53,700      45,900      56,024
500-549        204      1074     105,978      43,500     102,000      47,478
550-599        200       870     115,872      33,500     110,000      39,372
600-649         93       670      58,191      28,850      55,800      31,241
650-699        128       577      85,659      22,450      83,200      24,909
700-749        128       449      93,265      16,050      89,600      19,715
750-799         56       321      43,512      13,250      42,000      14,762
800-849        109       265      89,513       7,800      87,200      10,113
850-899         92       156      81,298       3,200      78,200       6,298
900-949         35        64      32,614       1,450      31,500       2,564
950-999         22        29      21,387         350      20,900         837
1000-1049        6         7       6,130          50       6,000         180
1050-1099        1         1       1,066                   1,050          16

Totals        2447                1,110,838   1,050,300   1,050,300   1,110,838

(a) Summation from the bottom upwards.
(b) I.e. number in column (2), one line further down, multiplied by 50 (e.g. 50 × 2259 = 112,950).
(c) I.e. numbers in column (1) multiplied by 50, 100, 150, 200, etc., up to 1100 (e.g. 1000 × 6 = 6000).
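The arithmetic behind Table E can be checked mechanically. The following is only a sketch in Python, not a reconstruction of the Hollerith procedure: columns (1) and (3) are copied from the table, the function and variable names are illustrative assumptions, and columns (2), (4), (5) and (6) are rebuilt from footnotes (a), (b) and (c).

```python
BAND = 50  # width of each age group in days, as in Table E

# Column (1): numbers of vessels; column (3): days exposed (both from Table E).
vessels = [118, 70, 130, 142, 99, 169, 83, 236, 224, 102, 204,
           200, 93, 128, 128, 56, 109, 92, 35, 22, 6, 1]
days_exposed = [3637, 5357, 16087, 25887, 22564, 46325, 26635, 86050,
                95587, 48224, 105978, 115872, 58191, 85659, 93265, 43512,
                89513, 81298, 32614, 21387, 6130, 1066]


def table_e(vessels, days_exposed, band=BAND):
    """Return rows of (col 2, col 4, col 5, col 6) for each age group."""
    n = len(vessels)
    # Column (2), footnote (a): summation from the bottom upwards.
    cumulative = [0] * n
    running = 0
    for i in range(n - 1, -1, -1):
        running += vessels[i]
        cumulative[i] = running
    rows = []
    for i in range(n):
        # Column (4), footnote (b): column (2) one line further down, times 50.
        higher = band * cumulative[i + 1] if i + 1 < n else 0
        # Column (5), footnote (c): column (1) times the lower age limit.
        lower = vessels[i] * band * i
        # Column (6): days exposed at the given age.
        at_age = days_exposed[i] + higher - lower
        rows.append((cumulative[i], higher, lower, at_age))
    return rows


rows = table_e(vessels, days_exposed)
total_at_age = sum(row[3] for row in rows)
print(rows[0], rows[1], total_at_age)
```

Because the totals of columns (4) and (5) are equal, the total of column (6) must reproduce the total of column (3), which gives a convenient check on the whole tabulation.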


SUPP. VOL. IX. NO. 2. 



162 

Vajda- 

Column. 


1 . 

Class of Card 

2-5 . 

Number 

6 . 

Builder 

7-19 . 

Constructional Details 

20 . 

Number of Casualty 

21-25 . 

Date of completion 

26-30 . 

Date of Casualty . 

31-37 . 

Latest Casualty . 


Statistical Investigation of Casualties [No. 2, 

Table B.— 

Casualty cards will be punched “X” (the position above
the “0” hole). This leaves the 1-9 range of punching
for special investigation designations where separate
card files are created.


As recorded on, and punched from the History Card.


The Day (2), Month (1), Year (2) of the latest Casualty. 

The Cause (1), Result (1), Fault (1) and Length of Spell of 
Service in days (4), for the most recent Casualty. In all 
cases except the First Casualty the details will appear 
within the “Latest” Casualty columns and their own 
particular field. 


Table C. — 


Age Group.   BUILDER.
             A              B              C              D
             (1) (2) (3)    (1) (2) (3)    (1) (2) (3)    (1) (2) (3)

1 

1 ,750 

' 11 

50 

1,510 

18 

50 

18.5 

5 

9 

181 

1 

5 

2 

2.204 

10 

80 

1,998 

11 

2S 

1 ,808 

iV 

19 

480 

8 

0 

8 

5,770 

« 

47 

4,000 

.5 

88 

1 ,880 

7 

! 4 

1,440 

8 

11 

4 

5,798 

8 

82 

0,470 

9 

30 

8,8.47 

10 

19 

0.872 

4 

84 

f) 

0.709 

15 

80 

5,240 

7 

28 

2,479 

8 

J 1 

8. 1 88 

12 

30 

» 

7,50(1 

K 

2.H 

8,190 

7 

80 

0,020 

0 

24 

14.899 

0 

54 

7 

8,210 

8 

10 

8,200 

.5 

JO 

7,11,4 

11 

2'i 

8.(487 

.5 

27 

H 

1 8.209 

9 

80 

7.97 4 

2 

OO 

14,128 

0 

89 

21 ,982 

0 

00 

9 

1 1 .H5 t 

8 

2H 

0,427 

2 

15 

12,815 

0 

80 

21,048 

1 

58 

](► 

4,780 

8 

10 

0,02 1 

2 

14 

7,107 

8 

15 

18,207 

(4 * 

28 

11 < 

5.05.5 

8 

11 

12, 4. 58 

2 

24 

! 18,0 42 

0 

20 

21,180 

.5 

; 41 

12 

1 1 .57H 

2 

20 

18,800 

1 

21 

1 12,7.58 

2 


10,101 

1 

28 

18 

8,750 


0 

8,701 

1 

14 

! 8.712 

.5 

u 

7,. 502 

8 

12 

14 

0,002 

1 

9 

7,845 

1 

11 

12,018 

1 

18 

1 0,0.58 


15 

ir> 

8,010 


5 

5,85 1 


8 

1 9,488 


18 

18,87 4 


19 

19 

8,HS.', 


5 

2,881 


8 

i 0,998 


9 

7,770 


10 

17 

5.742 


7 

4,128 


5 

1 9,085 

2 

11 

14,771 


18 

1H 

i 7,941 

1 •• 

9 

4,875 


5 

1 15,912 


18 

0.208 


7 

19 

980 


, 1 

2,771 

i 

8 

! 0,510 


7 

980 


1 

20 ; 







907 


1 

1.987 


2 

V, 







1 1,018 

1 


1 

1,011 


1 

Totals 

! 

112,044 

1 

79 

880 

118,575 

09 

8.58 

j 154,0(47 

100 

842 

201,802 1 

59 

478 


(1) Number of days which the vessels enumerated in col. (3) were exposed to risk before


1 

1,404 

12 

' 45 

o 

1,007 

12 

, 21 

8 

4,785 

8 

1 39 

4 

5,440 

5 

80 

5 

8.001 

17 

10 

0 

7,000 

18 

20 

7 

8,220 

7 

! 10 

8 

10,940 

11 

i 80 

9 

12,804 

5 

29 

10 

4,780 

0 

10 

11 

1 1 ,004 

0 

21 

12 

15,015 

4 

27 

18 

5,084 


1 9 

14 

10,058 

i 

15 

15 

10,920 


; 15 

16 

0,210 


1 8 

17 

18.100 


1 10 

18 

10,578 


1 1" 

19 

980 


i 1 

20 




21 




22 


• • 


Totals < 

189,108 

1 107 

a-'O 


1 ,088 1 

18 

87 

181 

1 ,808 

18 

IS 

187 

8,420 i 

8 

2vS 

1,004 

4,917 

10 

27 

1,7.51 

4,048 

12 

18 

077 

0,8.58 

10 

25 

5,250 

2,500 

6 

8 

8,858 

8,8(48 

5 

23 

12,204 

0,83:) 

5 

10 

11,980 

7, .500 

0 

10 

0,022 

10,592 

8 

82 

17,112 

10,745 

4 

29 

19,742 

9,407 

8 

15 

9,870 

9,485 1 

4 

14 i 

17,889 

18,811 1 


19 

21,900 1 

4,002 1 


6 

8, .547 

9,008 i 


11 

15,012 

10, ,578 ' 

*i 

12 

21 ,80(4 

8,788 1 

1 

4 

9,817 




4,878 




1,018 

141,054 

j 104 

! 358 

1 189,908 


Table D. 


.5 

4 

107 

1 i 

4 

17 


220 


8 

10 

8 

1,175 

4 

0 

15 

10 

5,590 

5 

.80 

10 

8 

5,957 

14 i 

26 

10 

19 

12,908 

11 

47 

17 

12 

7,049 

7 i 

22 

11 

84 

20,459 

12 : 

56 

11 

28 

24,078 

^ 1 

58 

14 

14 

10,40(4 

10 ' 

22 

8 

88 

22,802 

9 

48 

8 

.84 

20,225 

1 

85 

0 

15 

9,885 

8 

15 

2 

20 

, 10,758 


25 


80 

19,722 


27 


11 

1 1,055 


15 

' 2 

19 

18,871 


23 


24 

7,072 


8 


10 

980 


1 


5 

2,889 


3 


1 

1,011 

• • 1 

1 

141 

342 

219,805 

' 88 

473 


(1) Number of days exposed of vessels enumerated in col. (3).





Casualty Card, 

Column. Heading. 

38-44 . 2nd Casualty 
or 

45-51 . 3rd 
or 

52-58 . 4th „ > 

or 

59-65 . 5th 
or 

66-72 . 6th 
73-80 . 

Details of Cause (1), Result (1), Fault (1) and Length of
Spell of Service (4) for each Casualty.


Unpunched.

Note: On the “Modified Casualty Cards” columns signified age at casualty, in days.


First Casualties,


Age Group.   E              F              G              Totals.
             (1) (2) (3)    (1) (2) (3)    (1) (2) (3)    (1) (2) (3)

1 


Ki 

1.5 

51.5 

0 

, 18 

' 817 

24 

1 d" 

5.203 

73 

' 100 


I,17S 

]:i 

1(1 

i.‘25r» 

5 

, 17 

' 1,132 

5 

15 

0.5(10 

(14 

, 131 

:< 


12 

17 

3,045 

3 

24 

; 2,472 

2 

20 

20,854 

40 

' 100 

t 


10 

25 

4,502 

2 

2.5 

3,141 

4 

17 

34,044 

51 

JSH 

ft 

2.C7;J 

10 

12 

4 , 75 s 

() 

21 

4.001 

5 

18 

34,058 

03 

1 151 

fi 

r>,4‘2() 

15 

‘20 

8,103 

1 

30 

4,802 

4 

IK 

.5.5,780 

50 

1 204 

7 

5.47:^ 

0 

j 7 

4,853 

2 

15 

i 2,23(4 

4 

7 

34.722 

30 

' 108 

s 

i‘2,sor, 

10 

25 

1.3,577 

2 

37 

(1,033 

4 

18 

00,42 4 

39 

1 247 

!) 

1 4,0()() 

4 


14,488 


1 34 

' 5.. 50(1 

2 

i 13 

00,704 

18 

213 

M) 

lO.fiSl 

7 

22 

8,404 

3 

; 18 

2,305 


5 

52,01 1 

20 

1 112 

11 

1 ‘2,.') 1 :i 

1 

24 

13,103 

2 

' 20 

5 100 


10 

HI, 1:52 

«•> 

1 102 

12 

12,217 

5 

21 

12,105 


21 

1 0,335 


! 11 

85,074 

11 

147 

Ifi 

7.10‘{ 

o 

12 

0,001 

2 

1(1 

; 1 ,.S78 


3 

48.123 

13 

: 77 

It 

7,:us 


1 1 

7, ‘287 

3 

11 

■ 3.08(4 


0 

54,074 

0 

1 81 

1.') 

7,0 41 

2 

J 1 

4,351 

1 

0 

710 


1 

45.801 

4 

03 

in 

fi, 1 00 

1 

H 

3,08H 

2 

4 

777 


1 

31,040 

3 

1 40 

17 

(4,.') 1 1 

1 

S 

, 12,200 


, 15 

1,015 


2 

54.in>5 

3 

; 00 

IS 

S.S40 


10 

13,275 


15 



1 

50, .5411 


04 

10 

o.rios 


(1 

2,700 


3 




10,520 

*1 

1 21 

20 

4,S72 


r, 

1 ,04 1 


2 




0,717 


1 

‘21 

1.010 


1 

2.0«)0 


1 2 




5.004 


1 

"I''* 




1,00(1 


. 1 

! 



1 ,000 


J 

Totals 

140,:{70 

12S 

2 : 1 1 

1 4 7, .537 

43 

' 301 

I .53,812 

1 

51 

1 202 

022,710 

520 

• 2,447 


incurring their first casualty. (2) Casualties. (3) Vessels, listed according to age at first casualty.


All Casualties,


1 

05 

13 

2 

202 

9 

<) 

550 

24 

17 

3,(4;{7 

77 

J 18 

2 

210 

1(4 

.3 

004) 

5 

12 

847 

0 

n 

5,357 


70 

3 

875 

15 

7 

2,005 

3 

21 

2,207 

4 

18 

10,087 

5-2 

130 

4 

1,038 

21 

0 

4,354 

3 

1 24 

2,785 

4 

15 

25, .887 

03 

J42 

5 

041 

15 

4 

3,435 

0 

15 

3,010 

(4 

17 

22,504 

80 

90 

0 

1,701 

20 

() 

7,924 

1 

20 

1,0J8 

4 

17 

40,325 

09 

109 


3.198 

18 

10 

1,.521 

3 

1 t 

2,234 

(4 

7 

20.035 

01 

83 

8 

11,717 

17 

3‘2 

14,008 

2 

41 

7,328 

5 

20 

8(4,050 

03 

230 

0 

17, -200 

12 

40 

1(4,100 


38 

0,420 

2 

15 

05,587 

38 

2-24 

10 

8,038 

13 

17 

7,5.57 

’3 

i 10 

3,311 


7 

48,224 

52 

102 

11 

10,072 

9 

31 

15,030 

2 

20 

7.830 

2 

15 

105,078 

39 

204 

12 

18,500 

13 

32 

14,545 . 

2 

25 

10,440 


18 

115,872 

27 

200 

13 

11,800 

7 

10 

8,7(44 

2 

14 

3,750 


0 

58,101 

21 

03 

14 

18,740 

1 

; 28 

7,318 1 

4 

; 11 

0,010 

1 

0 

85,0.50 

13 

]‘28 

15 1 

18,040 

4 

j 2(4 

4,350 1 

J 

. 0 

3,(410 1 


5 

03,205 

5 

128 

16 

7,770 

1 

! 10 

3,108 

2 

; 4 

1,554 


2 

43,512 

3 

50 

17 

1.5.(412 

2 1 

1 10 

15,550 


10 

1,045 


2 

89,513 

4 

100 

18 

13.245 


' 15 

17,020 1 


1 20 

809 


1 

8J.208 

1 

0‘2 

10 

1 1,171 


1 

(4,528 ' 


7 1 




32,014 

1 

35 

20 

10,715 


' 11 

2,005 1 


■ 3 j 




21,387 

, . 

2*2 

21 

1,010 


1 1 

3,000 1 


1 d 



1 

0, ! 30 



22 




1,000 





1 

1 

1 000 


J 

Totalsj 

188.708 

107 

331 

10‘2,086 

48 

, 361 

1 00,054 

04 

j ‘202 

1 1,110,838 

r-'“: 

[ 2,447 


(2) Casualties. (3) Vessels, listed according to highest age attained.





Discussion on Dr. Vajda’s Paper 

Mr. Maddex: I feel I am not a very suitable person to propose this vote of thanks or to take 
part in the discussion of Dr. Vajda’s paper because, as a practising actuary, I confess that I have 
more interest in results than in methods, whereas it is quite clear Dr. Vajda is more interested in 
methods than in results. Nevertheless I am glad to have the opportunity of making what little 
contribution I can from my limited reading of the paper. 

The actuary’s approach to problems of this kind is severely practical. He seeks to express
an experience in terms of a series of rates, breaking the rates down into their simplest components.
Hence his primary concern, perhaps, is to get as good an approximation as he can to what, in the
actuary’s jargon, is called the “exposed to risk” and to ensure the validity of his results as a basis
for forecasting.

Dr. Vajda has gone a long way along the path of the actuary. One takes considerable pleasure
in watching his careful build-up of the exposed to risk, and in the extraordinarily full and valuable
tables which he has produced with the aid of his Hollerith tabulation system. Whether, with the
numbers involved (which are really rather small), the Hollerith system is economical as well as
efficient, I do not know. I should have thought that a hand-written and hand-sort basis of
tabulation might have been more flexible, but the results given in the appendices to the paper
speak for themselves.

Problems of this sort are not strange to the actuary. He has developed a well-established
technique, more particularly in relation to the analysis of sickness and mortality and accident
experiences, but generally he has to do with relatively large samples and is not confronted with the
statistical problems which are the special reason for Dr. Vajda’s paper. At times, however, he
has to make up his mind on similar questions in a small experience; I would instance the
examination of the mortality, and still more the sickness, experience of a small Friendly Society. There it
may be extraordinarily difficult to say whether or not a particular sickness experience represents a
significant deviation from some standard, even if it be only the standard of the Society’s own
previous experience.

You have in such cases also a problem which Dr. Vajda has avoided by not considering the
duration of his casualties. The sickness rate can most simply be considered as a compound of
two components: an attack rate (corresponding to the author’s accident rate) and a duration of
claim. The theoretical problem which that presents seems somewhat intractable, and while
from a practical point of view the analysis of sickness in those terms is an elementary step, I have
yet to see any development of statistical theory which is at all helpful in considering it. I should
like to think that Dr. Vajda’s paper is a first step in that direction. However, when it comes
to a critical examination of his mathematical analysis I confess that I am not (to use his own
rather disarming phrase) competent in statistical theory; nor am I versed in the literature on
which he draws. Indeed, when I get to that stage, like Goethe in another connection, I gaze in
wonder; I do not seek to understand.

You will appreciate, therefore, that I do not share the disappointment which Dr. Vajda
expresses at the fact that his tables allow conclusions to be reached almost by inspection.
Disappointing as that may be to the mathematician, I should have thought it a tribute to the statistician,
because, as I see statistical investigation on what may be called the non-mathematical side, its
object is so to subdivide, analyse and generally pull to pieces the figures which the data present
as to bring out these results; and it seems to me that that is evidence as much of statistical
competence in one direction as the more subtle mathematical analysis after which Dr. Vajda strives.

I am curious to know a little more about the material which the author has used. One
recognizes the limitations which he is under, but I should like to know whether the accidents with
which he is concerned are solely breakdowns due to internal causes, or whether, whilst excluding
war casualties, they include what might be properly called marine hazards. This is a relevant
point, because it affects not only one’s approach to the incidence of the casualties, but also the
possible interpretation of the various analyses made. I take it that a casualty which results in
permanent withdrawal is included both among the accidents and among the exits; this is to be
presumed, but it was not quite clear from the paragraph at the beginning of the paper.

I would like to know a little more also about the nature and effect of the exits. In this
experience it is probably not material, because they are less than 10 per cent. of the entries, but
one would like to be sure that they were in no sense selective, that is to say that the withdrawal
of these individuals was not in any way due to their own accident experience, whether on the
favourable or the unfavourable side. The point is a commonplace to actuaries, because the
influence of selective withdrawal in affecting the experience of the residual population is an element
with which actuaries are only too familiar, and which they have to watch in interpreting their results.

Another point on which one would like to hear the author’s justification is his method of
taking casualties as isolated occurrences in a continuous exposure running over the whole of the
three years, making no deduction for the time during which a casualty is in dock, or, indeed, for
the time during which vessels in the normal course are in harbour and therefore not exposed to the
casualty risk. Both types of temporary withdrawal from the experience may be important in
themselves, and while one accepts the author’s statement on the point, I gather he is referring only
to accidents and not to normal time spent in harbour. The effect of making no deduction may
be small, but one would like to be assured that the repercussions have been fully considered.

It is difficult to comment without knowing more about the types of vessels, but let us suppose 
that merchant vessels were under consideration. Very roughly — and these are illustrative figures 
only — supposing the average round-voyage time is about four months, about a month of that 
time might be spent in port, moving cargo, turning round, bunkering, and so forth, and another 
ten days in repairs; you thus dispose of about a third of the round-voyage time as time during 
which the vessel is not exposed to risk. That is an average, but the proportion would vary greatly 
between vessels employed in different oceans. The North Atlantic, with a shorter round-trip 
and a higher marine risk, might possibly show half-time not exposed to risk. If, therefore, the
vessels in the experience were engaged in different oceans, that would be a factor which might be 
just as important as any of the other factors considered in influencing the casualty rate. 

I think this consideration is important, because the later stages of the investigation are not
an analysis of isolated non-reversible incidents (like deaths for instance), but of life histories in
which the individual may retire temporarily, endure a period of invalidity or other non-exposure, 
and return again to the active list several times before he dies or disappears (assuming he does not 
last out the static three years’ period). It should be remembered also that the author does not 
follow through his experience as is done in what an actuary calls a “ select investigation ” ; he is 
unable, that is, to follow it from the point of entry at completion of the vessel, year by year or 
month by month over the same number of months for every vessel, but has to stop at the end 
of his three years — a serious defect from the point of view of the specific objects of the analysis. 
It is difficult to believe that the large proportion of time which is spent out of active service is not 
a factor of importance when you come to investigate the casualty rate on a durational basis — more 
particularly when you look at second and later casualties, for which the members are relatively 
small and the casualty rates heavy. It seems to me that it is not only a question of under- 
estimating the casualty rate but, more important, of distorting it. 

That leads to a somewhat similar point, in essence. Table 5, which is a preliminary tabulation
according to builder, is very interesting but rather tantalizing, because it is not quite clear to me
without further explanation precisely what those figures represent, giving durations from
completion to first accident or from one accident to the next. The differences are certainly significant,
but significant of what? The author has not stated, although he may have satisfied his own mind, 
that he has been able to eliminate or compensate for the effects of differential incidence of entry 
into the service, remembering the considerable seasonal variation in the casualty rate ; the differences 
due to service in different oceans; different proportions of time which would be spent in port, 
on different kinds of trip, and so forth. I do feel that, from the point of view of what might be 
called the arithmetical statistical investigation, those are important points, and I should like to
know precisely what has been done about them, and how the author assures himself and us that 
they did not enter into his experience in such a way as to vitiate the attempt to compare in isolation 
the experience with reference to builders, age and successive accidents, on the assumption that 
there are no other variables vitiating the comparison. 

The difficulty of giving weight for age in these analyses seems to have weighed heavily on the 
author, and I wonder if he employed any method of standardizing his results by weighting accident 
rates for individual builders with the exposed to risk by age for the whole experience, and 
vice versa. You do not, of course, get conclusive results that way, but I should have thought
it was a useful first step; I expect the author has done it, but if he did, one would have liked 
to see the results before going on from that straightforward approach to his mathematical 
analysis. 
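The standardization Mr. Maddex asks about can be illustrated in a few lines. This is a minimal sketch in Python under stated assumptions: the two builders, their casualties and their days exposed are hypothetical figures invented for the example (they are not taken from the paper), the function name is illustrative, and every age group is assumed to have a non-zero exposure.

```python
def standardized_rate(casualties, exposure, standard_exposure):
    """Weight a builder's age-specific casualty rates by a standard exposure.

    casualties[i] and exposure[i] refer to the builder's own experience in
    age group i; standard_exposure[i] is the exposed to risk of the whole
    experience in the same age group.
    """
    rates = [c / e for c, e in zip(casualties, exposure)]
    weighted = sum(rate * w for rate, w in zip(rates, standard_exposure))
    return weighted / sum(standard_exposure)


# Hypothetical data: two age groups, exposure in vessel-days.
whole_experience = [3000.0, 1000.0]
builder_x = {"casualties": [10, 5], "exposure": [1000.0, 1000.0]}
builder_y = {"casualties": [4, 12], "exposure": [500.0, 2000.0]}

for name, b in (("X", builder_x), ("Y", builder_y)):
    crude = sum(b["casualties"]) / sum(b["exposure"])
    std = standardized_rate(b["casualties"], b["exposure"], whole_experience)
    print(name, "crude:", crude, "standardized:", std)
```

The crude rates differ partly because the two builders have different age distributions of exposure; the standardized rates compare them on a common age distribution. The "vice versa" calculation, weighting age-specific rates by the exposure by builder, follows by exchanging the roles of age and builder.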

I should like to say what a pleasure it is to read a paper in which all the data are exposed so
fully for the benefit of the reader, and in which, by the way, there is not a single table in which the 
numbers add up to a hundred. 

Mr. E. G. Chambers: As one who has been associated with the study of accident proneness
for a good many years I listened to this paper with considerable interest, and feel it should provide 
some very useful suggestions for people who are working in this type of field and with similar 
types of data. 

I am not qualified to criticize the more mathematical aspects of the paper, but there are two
small points I should like to mention. First, as one closely associated with the original use of the
term “accident proneness,” used to denote the personal psychological qualities entering into the
human susceptibility to accidents, I must enter a gentle protest against the application of the
term to ships.* I say a gentle protest advisedly, because I am only too conscious of the fact that
some psychologists have borrowed and mis-applied statistical techniques, and it is perhaps only
fair that statisticians should borrow and possibly mis-apply psychological terminology. Sailors,
I believe, claim that each vessel has its own individuality. Does this individuality extend to cover
what we call “accident proneness” in human beings?

I have here a possibly very naive suggestion: that is that the question might be examined to 
see whether the distribution of all casualties which Dr. Vajda gives on the first page of his paper 
shows a good fit to a Poisson distribution, or if it shows a better fit to the Greenwood-Yule negative
binomial distribution. I have not had the actual figures long enough to try this out for myself, 
but, in the absence of proof that the latter theoretical distribution does give the better fit, I suggest 
the use of the term “accident liability” with regard to ships instead of “accident proneness.” 
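The comparison Mr. Chambers proposes might be carried out along the following lines. This is a hedged sketch in Python, not the author's calculation: the counts are hypothetical (the distribution on the first page of the paper is not reproduced in this discussion), both distributions are fitted by moments rather than by maximum likelihood, the last cell is treated as exactly 5 when computing the moments, and all function names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)


def neg_binomial_pmf(k, r, p):
    # Gamma(r + k) / (Gamma(r) k!) * p**r * (1 - p)**k, valid for non-integer r
    log_coef = math.lgamma(r + k) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(p) + k * math.log(1.0 - p))


def chi_square(observed, pmf):
    """Pearson chi-square; the last cell is treated as 'k or more'."""
    total = sum(observed)
    probs = [pmf(k) for k in range(len(observed) - 1)]
    probs.append(1.0 - sum(probs))  # pooled upper tail
    return sum((o - total * q) ** 2 / (total * q)
               for o, q in zip(observed, probs))


def compare_fits(observed):
    total = sum(observed)
    mean = sum(k * o for k, o in enumerate(observed)) / total
    var = sum(o * (k - mean) ** 2 for k, o in enumerate(observed)) / total
    if var <= mean:
        raise ValueError("variance does not exceed mean; moment fit fails")
    # Moment estimates: mean = r(1 - p)/p and variance = r(1 - p)/p**2.
    r = mean ** 2 / (var - mean)
    p = mean / var
    return {
        "mean": mean,
        "variance": var,
        "chi2_poisson": chi_square(observed, lambda k: poisson_pmf(k, mean)),
        "chi2_neg_binomial": chi_square(
            observed, lambda k: neg_binomial_pmf(k, r, p)),
    }


# Hypothetical numbers of vessels with 0, 1, 2, 3, 4 and 5-or-more casualties.
hypothetical_counts = [300, 120, 50, 20, 7, 3]
result = compare_fits(hypothetical_counts)
print(result)
```

A markedly smaller chi-square for the negative binomial would point to over-dispersion, but, as the later remarks of Mr. Beard make clear, it would not by itself distinguish between differing liabilities among vessels and a liability that grows with each casualty.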

The other point has already been touched on by Mr. Maddex: that is, that Dr. Vajda has
shown a marked relationship between casualty rates and month of the year, as may be seen in
Table 4. I suggest that this relationship must have a very strong effect on the time after launching
before the first casualty. For example, a vessel launched in November has a much greater chance
of an early casualty than one launched in, say, June. This same factor may have an effect also
on the time of the first casualty of ships built by different builders. If some builders tend to launch
their ships in the spring and others in the autumn (I do not know whether that is so), then the
unequal distribution shown in Table 6 may be in part accounted for by the fact of the unequal 
distribution of casualties during the seasons of the year. In any case that factor rather seems to 
me to have been neglected in the analysis. 

I have very much pleasure in seconding the vote of thanks. 

The vote of thanks was then put to the meeting and carried unanimously. 

Sir William Elderton said he was in the same difficulty, to a very large extent, as the author,
because having been mixed up with shipping a good deal during the War he had to be a little careful 
what he said, even at that stage, so that although he had guessed a good deal about what was 
behind the paper, he would have to be careful not to disclose all that he had guessed. 

He thought it safe to say that it looked as if the definition of a casualty must follow fairly 
nearly the sort of accident that might occur and be covered by insurance, plus, perhaps, a certain 
number of things that might not be covered. The fact that the accident rate was so much higher 
in winter than in summer agreed well with the general experience of all merchant shipping, that
you do get a higher casualty rate and, of course, a higher loss rate, in winter than in summer. If a
vessel had any weakness and had a bad shaking across the Atlantic in the sort of weather experienced 
lately, especially if she had had to slow up on various occasions because submarines were about, 
or quicken for other reasons, or suddenly turn into a different direction, then that defect would be 
more readily shown up in those circumstances than in smoother waters. 

If you looked at these various building groups you found that they did not all run for the same
period. The ships of builder “G” apparently were running for about two hundred days short of 
those of builder “F,” and on the whole had a high rate of accident, especially in their early life. 
If the vessels produced by builder “F” were largely put, by chance or by design, on the route, say, 
from New York to India and back, and the vessels produced by builder “G” were, perhaps because 
some of them came on at a later stage, put on the North Atlantic route, in those circumstances 
even if they were all equally well built the effect on the casualties would have been on the lines of 
the experience because one lot would have been shaken about a good deal more than the other. 

Again a casualty was counted when a vessel had a small defect, or when a vessel had a big
defect; the big one would be out of action perhaps for several months, and the other one would
very likely be put right while the ship was turning round and would run a chance of having a second
accident very soon afterwards because she would be in service again. If, for instance, the 64 cases 
of the builder “G” were all casualties for long periods, they might yet have gone back again into 
the experience as merely a casualty and not permanently out. This might distort the rates of 
casualty. He thought therefore that the time when these cases happened, and the routes on which 
ships were employed might be examined to see what could be found from them. One little check 
might have been done, and probably had, namely to work out the monthly accident rate in the 
same way as Table 4 for each builder, and to see if different results were shown for the various 
builders. 

* Mr. Chambers is referring to an earlier printing of Dr. Vajda’s paper, in which the term “accident
proneness” was used. See also the concluding section of Dr. Vajda’s reply.





He had heard quite outside the Ministry with which he was at one time connected that there 
were some English owners who might like to buy American ships if they knew the yard in which 
they had been built; this implied a feeling that there was considerable variation in standard of
building. But it was possible to jump to wrong conclusions. Towards the end of the 1914-18
war there had been a certain amount of quick building in American yards, and a number of
people in this country had been horrified because a lot of those ships, the first time they had
got across, had had to have repairs, and it was voiced abroad, to a certain extent in the ordinary
Press, that this must have meant the ships were badly built; and yet a large number of the
remaining ships of that group, having been used for a bit and then laid up, were used strenuously 
during the recent war and had done very well. 

Mr. Buckland felt it was of great value to have access to these additional examples of the
application of statistical techniques to problems of operational research, because he believed that 
operational research would have to become a part of any large organization in this rather difficult 
post-war world. As Dr. Vajda had implied, while the particular material under review might 
be of passing interest, the methods were of a more durable kind. 

In his own field, at the London Passenger Transport Board, there seemed to be at least a prima 
facie case for the application of the methods of the paper, arising out of the similarity between 
groups of vessels and groups of vehicles; road vehicles particularly, as rail vehicles were not so 
much in the same class. An example had been touched upon briefly by the Government Actuary:
if the definition of accidents for road vehicles was extended to mechanical failures resulting either
in a delay, or taking the vehicle out of service, then it seemed to him that in the paper they had
quite a number of techniques useful for the analysis of that kind of occurrence. It was a problem
which was fairly high on his agenda at the moment, so he was grateful for the additional
information.

The data in the paper were from the ordinary processes of operating, as distinct from the type
of information which came from a series of planned experiments. He felt that was a distinction
which had to be borne in mind since it did largely govern a great part of the analysis. For example,
much of the analysis would fall under the heading of “making sense out of figures,” which Dr.
Yates had recently stated was an activity which still took up a great deal of the statistician’s time.
Quite early in the section devoted to the analysis of material, Dr. Vajda had stated that the observed
facts did not agree with his theoretical model, so he had had to cast around for some likely
explanation. This was another extremely important aspect. There were, in fact, some interesting papers
in the current issue of Biometrika on those lines.

He would raise two points, one possibly of major interest, one minor. The first referred to
accident-proneness and age. It would appear that age was to be taken as some fairly continuous
function of time; what, then, was the effect of a period spent in dock? He raised this problem
only because for a road vehicle, such as the Board were in the habit of operating, a period in dock
did materially change its effective age. In an imaginary case, for example, a vessel (or a 'bus, in
his own sphere) might become the equivalent of a new completion. Anyone who had journeyed
around in the “ST” type vehicle that they still had to use on the road would realize how the 
maintenance and periods in dock had prolonged its effective life, if not its comfortable life. 

Mr. R. E. Beard said he had found the paper of very great interest. It linked up rather
closely with some work he had been doing during the war, and he wished to refer to that problem. 
He had been concerned with accidents to naval aircraft, and one of the problems that had to be 
considered was the action, if any, that should be taken if a particular pilot had more than a certain 
number of accidents in a given period. There were very few statistics to begin with, but a certain 
amount of material was built up, and the pilots were then classified by the number of accidents 
they had suffered in the period of a year. Unfortunately, the period of exposure was not known, 
and, for various reasons, the pilots with no accidents could not be accurately determined; however,
details of those who had one, two, three, four, etc., accidents respectively were available. First
of all the data were studied to see if they showed any obvious features. A Poisson distribution
was first tried, but did not fit the data; then a generalized Poisson was tried and that did fit the data.
The point then was, was there any mathematical model that could be used which would give rise 
to the observed distribution? There were two of consequence, and the fact that there were at least 
two showed the difficulties in building up mathematical models to fit the statistics. The first 
model was that there was a variation among the population of pilots as regards their liability to 
accidents, this being, of course, closely allied to the studies of Greenwood and Yule; the second 
assumption was that the accident proneness increased in arithmetical progression with each 
accident. These two basic assumptions gave rise to the same distribution for the numbers of pilots 
with one, two, etc., accidents. 

Obviously the remedial measures in the two cases would be different. If it was demonstrated 
that the more accidents experienced the more likely it was that another would be suffered, then 
obvious steps should be taken after a prescribed number of accidents had been suffered. On the 
other hand, if the cause was variation among the pilots, alternative tests must be devised 
to discover those pilots with a high proneness before they had incurred any accidents. The data, 
however, did not allow a differentiation between the two models, and the present paper did tackle the 
problem in the same way that was tried, namely by analysis of the intervals between the accidents. 

Turning now to the present paper, the first reaction was to see, with regard to Table 1, 
whether the generalized Poisson did fit these figures. They had already heard of the difficulties 
and dangers inherent in the data, so the figures were quoted as a piece of arithmetic ! The results 
were as follows: 

    Number of      Actual number      Expected number
    accidents.     of vessels.        of vessels.

        0             1918               1912.5
        1              373                387.1
        2              117                104.0
        3               24                 30.1
        4               11                  9.1
        5                3                  2.7
        6                1                  1.5


The fit was very satisfactory, and led to the suggestion that some progress had been made. Actually 
there was probably very little, because the period during which the different vessels were exposed 
differed so considerably. Some were exposed the whole of three years, and others for a lesser 
period, so that it was necessary to adopt a more exact investigation as developed by the author. 
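
The arithmetic behind such a fit can be sketched as follows. This is only an illustration under stated assumptions: the "generalized Poisson" is read here as the negative binomial, the fit is made by the method of moments (the speaker's actual fitting method is not recorded), and the last class is treated as "6 or more". The function name `fit_negative_binomial` is hypothetical.

```python
# Observed frequencies from the table above: accidents per vessel -> number of vessels.
OBSERVED = {0: 1918, 1: 373, 2: 117, 3: 24, 4: 11, 5: 3, 6: 1}


def fit_negative_binomial(freq):
    """Fit a negative binomial by the method of moments (an assumed fitting method).

    Returns (mean, variance, k, p, expected), where expected maps each class to
    its fitted frequency. Classes must run 0, 1, 2, ... without gaps, and the
    last class is treated as open-ended ("6 or more"), which is an assumption.
    """
    total = sum(freq.values())
    mean = sum(x * f for x, f in freq.items()) / total
    variance = sum(x * x * f for x, f in freq.items()) / total - mean ** 2
    # Moment equations: mean = k(1 - p)/p and variance = k(1 - p)/p**2.
    p = mean / variance
    k = mean ** 2 / (variance - mean)

    classes = sorted(freq)
    expected = {}
    prob = p ** k  # fitted probability of no accidents
    for x in classes[:-1]:
        expected[x] = total * prob
        prob *= (k + x) / (x + 1) * (1 - p)  # step from P(x) to P(x + 1)
    expected[classes[-1]] = total - sum(expected.values())  # pooled upper tail
    return mean, variance, k, p, expected


mean, variance, k, p, expected = fit_negative_binomial(OBSERVED)
print(f"mean = {mean:.3f}, variance = {variance:.3f}, k = {k:.3f}, p = {p:.3f}")
for x in sorted(expected):
    print(f"{x}: observed {OBSERVED[x]:5d}, fitted {expected[x]:8.1f}")
```

The mean and variance computed this way round to 0.304 and 0.457, and the fitted frequencies for the lower classes come out close to the "expected" column of the table, which suggests (but does not prove) that a moment fit of this kind was used.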

His next point referred to Table 5. For the whole data the average interval between accidents 
was 222 days, the different builders showing a range from 150 to 278 days. However, reference to 
Table IV, which gave the frequency distribution of all the intervals, showed that it was rather a flat 
distribution. If Table 5 were regarded as showing samples from that distribution, a sample 
standard deviation could be worked out and the significance to be attached to the average 
intervals determined. 

Owing to the spread of the data in Table IV, the standard deviation was approximately 175 days. 
Bearing this in mind, and having regard to the number of casualties in each group, the statement 
“It is obvious there are very significant differences” did require some clarification. In fact it was 
doubtful if there was more than one significant difference, and if worked out accurately it was 
possible that there would be no significant differences. There were, of course, other considerations, 
from which some differences might be expected to arise, but perhaps Dr. Vajda might comment 
on why the significance was obvious without making some sort of test. 
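
A rough piece of arithmetic shows why the size of each group matters here. The sketch below takes the quoted standard deviation of 175 days and asks how many intervals a builder's group would need before its average differed from the overall 222 days by two standard errors. The two-standard-error criterion, the treatment of the overall mean as fixed, and the function name `min_group_size` are assumptions made for illustration; the actual group sizes are not given in this discussion.

```python
import math

SD_DAYS = 175.0             # standard deviation of the intervals, as quoted
OVERALL_MEAN = 222.0        # average interval for the whole data, in days
EXTREMES = (150.0, 278.0)   # lowest and highest builder averages, in days


def min_group_size(group_mean, overall_mean=OVERALL_MEAN, sd=SD_DAYS, z=2.0):
    """Smallest n for which |group_mean - overall_mean| is at least z standard
    errors, taking the standard error of a group mean as sd / sqrt(n)."""
    diff = abs(group_mean - overall_mean)
    return math.ceil((z * sd / diff) ** 2)


sizes = {m: min_group_size(m) for m in EXTREMES}
for m, n in sizes.items():
    print(f"average {m:.0f} days: at least {n} intervals needed")
```

On these assumptions even the most extreme builder averages need groups of a few dozen intervals before the difference reaches two standard errors, so averages nearer 222 days would need considerably more.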

One other small point was that Dr. Vajda had made the rather happy statement, “It would 
be interesting to apply some periodicity analysis to these data,” but it was surely almost impossible 
to get anything of any use out of the data for three years. There would be, at the most, three 
cycles, and it would seem that little confidence could be attached to the results derived therefrom. 

In conclusion he was afraid he might have sounded rather critical on minor points of the paper, 
but he did appreciate that it was a very valuable piece of work. Had it been before him three 
years before it would undoubtedly have saved him a considerable amount of analysis. He was 
grateful to Dr. Vajda for his paper. 

Professor Greenwood said he had been consultant to the Admiralty in medical statistics, but 
did not feel that gave him authority to comment upon the practical aspects of the paper; he would 
like, for mainly sentimental reasons, to refer to one of the theoretical aspects. 

Dr. Vajda had spoken very kindly of the paper which Mr. Yule and he had contributed to the 
Journal twenty-six years ago, and he hoped Dr. Vajda would also read a paper which was contri- 
buted to the Journal seven years later by his colleague, the late Dr. Ethel Newbold. The point 
was that in the earlier paper Mr. Yule and he had dealt with three hypotheses, and they had dealt 
with them necessarily rather superficially: the first hypothesis was that the multiple accident 
distribution was a chance distribution ; the second that first accidents were distributed by chance, 
but the probability that a person who had had r accidents (r > 1) would have r + 1 accidents 
was not the same as the probability that a person with r − 1 would have r accidents; the third, 
that proneness, or susceptibility, varied from person to person. To the comparatively small 
section of the public which would be likely to read any paper having his name on it, the third 
possibility was very much the most exciting, and as the negative binomial gave a nice fit to many 
data and was easy to use, the hypothesis attracted a lot of attention. But it did seem important 
that the algebraical basis should be strengthened, and that was the main point of Dr. Newbold’s 
work. 

At that time he was an enthusiastic disciple of the methods of Professor Chuprov, and Miss 
Newbold was also a disciple, which was the reason why the algebraical treatment was in Chuprov’s 
notation. No doubt younger and better mathematicians could reach the results rather more 
expeditiously, because Chuprov’s elementary methods took a little time. 

The first point he desired to make was that the application of the negative binomial to the 
elucidation of accident proneness could only lead to valid conclusions if the data were homogeneous ; 
one must not mix together records of different experiences, for instance of persons engaged on 
different processes of manufacture which entailed different risks. 

His next point, perhaps only another way of stating the first point, was that if the column 
totals of such a Table as V on p. 158 were better fitted by a negative binomial than by a Poisson 
(and, as Dr. Beard had said, they were better fitted) it did not follow that accident proneness, in 
the sense of those who had used the Greenwood-Yule method in industrial research, had anything 
to do with the result. It was well known that the sum of a number of variables each obeying the 
Poisson “law” was itself a Poisson variable, but it was not true that if we took a number of Poisson 
frequency distributions and summed the 0's, 1's, 2's, etc., the resultant totals of 0's, 1's, 2's, 
etc., would make a Poisson distribution; in fact it was easy to see that the variance of the summed 
data must exceed the weighted mean of the Poisson parameters of the set by the weighted mean 
of the squared differences between each Poisson parameter and the weighted mean of all of them. 
This fact did, however, suggest a possible test. If his arithmetic were correct, the mean of Table V 
was 0.304 and the variance 0.457. Suppose that each horizontal array were a Poisson, what ought 
the variance to be? One can only take the means of the arrays as approximations to the para- 
meters; doing this he made the “expected” variance 0.362, a good deal larger than 0.304, but a 
good deal smaller than 0.457, and, he thought, “significantly” smaller. This again was one of 
the results which could be reached by common sense; still, the method might sometimes be useful. 
He congratulated Dr. Vajda on completing so elaborate an investigation; he himself, like the 
author, was interested in the theory and not ashamed of that interest. 
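
The statement about the variance of summed Poisson distributions can be checked numerically. The sketch below pools several Poisson frequency distributions and compares the variance of the pooled distribution with the weighted mean of the parameters plus the weighted mean of their squared deviations. The parameters and group sizes are hypothetical, chosen only for illustration; the arrays of Table V are not reproduced in this discussion.

```python
import math


def poisson_pmf(lam, x):
    """Probability of x events for a Poisson variable with parameter lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def pooled_mean_and_variance(lams, weights, max_count=60):
    """Mean and variance of the distribution obtained by summing Poisson
    frequency distributions with parameters lams, pooled in proportion to weights."""
    total = sum(weights)
    mean = 0.0
    second = 0.0
    for x in range(max_count + 1):
        prob = sum(w * poisson_pmf(lam, x) for lam, w in zip(lams, weights)) / total
        mean += x * prob
        second += x * x * prob
    return mean, second - mean ** 2


def predicted_variance(lams, weights):
    """Weighted mean of the parameters plus weighted mean squared deviation."""
    total = sum(weights)
    mean_lam = sum(w * lam for lam, w in zip(lams, weights)) / total
    spread = sum(w * (lam - mean_lam) ** 2 for lam, w in zip(lams, weights)) / total
    return mean_lam + spread


# Hypothetical Poisson parameters and group sizes, for illustration only.
LAMS = [0.1, 0.3, 0.6]
WEIGHTS = [1000, 900, 547]

pooled_mean, pooled_var = pooled_mean_and_variance(LAMS, WEIGHTS)
print(f"pooled mean {pooled_mean:.4f}, pooled variance {pooled_var:.4f}")
print(f"predicted variance {predicted_variance(LAMS, WEIGHTS):.4f}")
```

The pooled variance exceeds the pooled mean whenever the parameters differ, which is the over-dispersion that a negative binomial will also fit.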

Dr. Heron wished to join those who had already spoken in thanking the author of the paper 
for a very interesting piece of work. He welcomed the author’s explanation that the material 
was really homogeneous, but he would like him to go a little further than that, because it was 
rather important to show that it really was homogeneous ; he thought he might have used some of 
the constructional data on the Hollerith card to indicate if the material was homogeneous, such 
as tonnage or speed. It was not necessary to give the actual figures, because index numbers 
could quite well be used and would be innocuous. 

Secondly, he wished to associate himself with what had been said about the treatment of the 
duration of risk, because this seemed a very important factor in the analysis of accident frequencies. 
In the first Table they would see that one vessel had, in the three years, six casualties, and three 
had five casualties. In the three years those four vessels must have spent some eighteen months 
in dock, under repair, and yet it was assumed in the paper that they were on service during the 
whole three years; he thought that was a serious defect. The data must, of course, be tucked 
away in some water-tight compartment of the Admiralty, and he had no doubt they could be 
obtained, and if so they should be used. 

There were two other points which might be considered ; those were the effects of deferred 
repairs and the records of the masters of the ships concerned. It was common practice for a 
vessel which had sustained a casualty to be patched up at the nearest available port and then to 
be sent to the most convenient yard for more complete repairs. Further, in wartime, and also 
when freights were exceptionally high, there was a marked tendency to execute only such repairs 
as were necessary to make the vessel seaworthy, and to leave complete re-fit to a more convenient 
time. While running after temporary repairs the vessel was definitely a worse risk than it was 
before the casualty occurred, and this could hardly fail to affect the result of an investigation 
such as this. 

Finally, just as it was well known that some drivers of motor-cars and some workers handling 
dangerous machinery were what is sometimes termed “unlucky,” that is they were more likely to 
have accidents, so some masters or some commanding officers in Naval vessels might be unlucky 
in the same way, and their ships might be more likely to suffer casualties. He thought, therefore, 
the records of the masters of the ships should be investigated at the same time. 

Dr. Sutton said that the paper was epitomized in paragraph 3, where it was said that the 
tables were rather disappointing because they allowed of conclusions almost by inspection and 
left little scope for analysis, but that the paper being concerned more with methods than results, 
a few possible ways of extracting information would be mentioned for their possible value in 
future similar investigations. Nevertheless, some comments must be made on the enquiry as 
presented, firstly in relation to Dr. Vajda’s suggestion that it might bear upon future similar 
investigations and, secondly, in relation to the Admiralty’s possible use of the conclusions. For 
such purposes the information seemed to be inappropriate to the three problems stated, and it 
might be useful to suggest in what ways. 

First, however, there must be some question as to the types of vessel involved, and in particular 
whether they were wholly Naval vessels. They might, of course, have included Ministry-owned 
vessels, but the large numbers completed at least suggested many small ones such as Naval 
types. If they were wholly Naval vessels then some of the following comments were not 
appropriate, though they might still be so in relation to other possible enquiries. The more 
elaborate the technique of enquiry the more must the information be appropriate to it. 

The main deficiency was the definition of a casualty, which was clearly stated to be outside the 
duties of the Statistical Branch. The information required here was not only whether a succession 
of mishaps was a single casualty, but what type of mishap was called a casualty, what limits in 
amount were taken for decision and whether unrepaired casualties were included. Many vessels 
went unrepaired for long periods, sometimes unreported. 

The types of casualty came under several distinct headings. 

Firstly there were war casualties of the kind covered by war risk insurance policies, which he 
understood had been excluded. It would be interesting, however, to know on what basis war 
risk was defined, since this was a very subtle decision in maritime law and the dividing line could 
affect results materially. 

Secondly there were collisions. Convoy collisions between commercial vessels were not war 
risks, though most people would take them to be. On the other hand, many kinds of collision 
could by technical decision be construed as war risks, save that Marine Insurance policies had 
reinstated them as Marine. During war years, such as those under review, the risk of collision 
was obviously made greater by very many other causes than convoys, including the dimming 
of both ship and shore lights, the congestion of anchorages and so on. 

Thirdly were groundings, which again were made more frequent in such conditions, and 
particularly during the use of unusual, damaged and neglected harbours and berths. 

Fourthly there were fires, which would be increased by various causes, including the relaxation 
of regulations and precautions, though not in Naval vessels. Sabotage was peculiarly suggestive 
with fires, though particularly difficult to prove to the satisfaction of war risk insurers. 

Fifthly there was heavy weather, the risk of which could be materially affected by deeper 
loadings and by the necessity for sailing in all sorts of conditions, where normally vessels 
would either ease up or not proceed at all or be differently loaded. 

Sixthly was damage by ice, the risk of which in ordinary times was very carefully watched. 

Seventhly there were breakdowns, whether of steering or machinery, the latter of which could 
be either of the main engines or of auxiliaries. In this, of course, the type of main engine would 
have considerable bearing. 

In relation to all of these the particular service of the vessels should be considered — that was, 
whether they were predominantly in such areas as the North Atlantic or Arctic, the Mediterranean 
or in specially protected waters. Consideration would have to be given also to whether their 
service was otherwise particularly intense, involving fatigue of crew, the taking of risks and pro- 
ceeding frequently at excessive speeds. Many vessels also were not allowed to repair properly 
during this period, and so were often not as well fitted to cope with conditions. 

Considerations of different kinds are with regard to the personnel of the vessels, both Engine 
and Deck, and how far negligence or variation of any standard of efficiency contributed to accidents. 
Of a different kind was the question of the ordered design and equipment of vessels, and even of 
management, if different managements should be involved. 

Anyone having practical experience would readily realize that in the great majority of such 
casualties there could be no direct question of builders and age of vessel at all. The problem of 
first and subsequent accidents might have something to reveal, though in practice first casualties 
seemed to have no effect on later ones. The different types of casualty might be broadly common 
within the groups of vessels according to builders and not greatly disturb a pronounced trend, 
but no one would make such a suggestion boldly. The peaks in winter were readily acceptable 
for nearly all kinds of casualty. 

The more important point, however, was that the effects of bad build and of age were ordinarily 
experienced indirectly rather than directly in casualties. The direct casualty, as for instance 
breaking in two, was the exception. The most usual types were an increased liability to heavy 
weather damage and to machinery breakdown, though with the latter the engine builders might 
differ from the hull builders. But bad build in particular, and age eventually, showed up in three 
very material ways: first by constant, small, troublesome occurrences that did not rank as 
casualties, but were cumulatively nearly as bad ; second, by the increased cost of repairing actual 
casualties; and thirdly by the much greater expense of survey upkeep. 

The amount of damage as also the delay involved in the definition of casualties might be 
important, since although it was clear that large amounts would not at all closely represent signifi- 
cance in the sense of the enquiry, it was equally obvious that if all damage in excess of some very 
small amount was included as a casualty the significance of the results would be greatly affected. 
Here, of course, the number of casualties suggested that some moderately high standard had been 
taken. 

The period used for analysis was also a relatively short one. 

The usefulness of the mathematical technique in itself was quite a different matter, but it would 
certainly be of interest to know how far Dr. Vajda has already felt any of the foregoing qualifica- 
tions to be desirable. He was sure the Society would agree that it was our duty to make clear 
both the practical use and the limitations of any methods we considered, even though it was well 
recognized that mathematical methods were often developed in advance of their practical appli- 
cation. 

True knowledge transcends personal ideas, and he would, of course, be as much interested in 
Dr. Vajda's correction of the foregoing as in his confirmation, but if he accepted any material 
part of it as relevant to satisfactory conclusions, what significance did he attach to the pronounced 
variations between builders on the alternative possibilities that the irrelevant effects did and did 
not broadly average out over the comparatively large number of vessels under review? And 
what further significance might there be with inclusion of the important effects not shown by direct 
casualties? Would the Hollerith installation be competent to deal with such fuller variations, 
or in other words, to what degree of detail did he consider the installation appropriate? What 
additional complications in mathematical technique would there be to provide the valuable 
suggestion and the all-important check upon prima facie practical conclusions so often possible 
by mathematics? 

As a more particular comment, why was it that casualties dropped so rapidly from June, 1944, 
onwards, in broad line with the decreasing numbers of vessels completed, but not with the continued 
increase in total vessels? It suggested a change of casualty conditions quite opposite to what 
should be an increasing effect of age and building defects. 

We had, of course, to realize the restrictions on the use of data under which the author had 
worked, and must thank him for a bold attempt at promoting interest in a new field. 


Dr. Solomon said his first reaction on reading the paper had been one of unqualified admira- 
tion, and his second, which he had hastily tried to overcome, was one of curiosity concerning the 
material involved. He had tried to suppress that reaction because this was essentially a paper on 
statistical technique, and he wanted to differ from most other speakers by discussing that aspect 
alone. 

The first point he wished to make referred to a general remark in Section 3 (a), which gave an 
analysis based on the work of Mr. D. R. Cox. He felt moved to say, with reference to a group 
who were rather under-rated by the general statistical community, that the method was precisely 
equivalent to the conventional actuarial treatment, and had been extended by many actuarial 
students using a different nomenclature. 

His next point was one of detail. In the section on the Analysis of Variance the difficulty 
appeared to be to obtain a valid estimate of the random variation against which the several hypo- 
theses could be tested. The difficulty in finding such an estimate was not uncommon in contin- 
gency table analysis, and he envied the happy agricultural statisticians who never had any such 
difficulty, but always found some interaction or other which was physically acceptable as an 
estimate of random variation. Now in this case the variance of the estimate of an accident rate 
was, nearly enough, proportional to q itself. The hypotheses tested were of the type “there is 
no effect other than so-and-so.” That implied that allowance was being made for the possible 
existence of two or more effects — that is of two or more population values of q, the accident rate. 
Therefore the postulate of the Paper of a single variance depending only on the exposures appeared 
at first sight to be unacceptable. The difficulty could of course be overcome by the standard 
methods of using not the accident rates themselves but their square roots, or the inverse sines of 
their square roots. 
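
The effect of the square-root device can be illustrated numerically. The sketch below assumes, purely for illustration, that the accident counts are Poisson (so that the variance equals the mean), and computes by direct summation the variance of the count and of its square root for several hypothetical means. The function name `poisson_transform_moments` and the chosen means are not from the paper.

```python
import math


def poisson_transform_moments(transform, lam):
    """Mean and variance of transform(X) for X ~ Poisson(lam), by direct summation."""
    upper = int(lam + 12 * math.sqrt(lam) + 20)  # far enough into the upper tail
    prob = math.exp(-lam)                        # P(X = 0)
    first = 0.0
    second = 0.0
    for x in range(upper + 1):
        value = transform(x)
        first += prob * value
        second += prob * value * value
        prob *= lam / (x + 1)                    # step from P(x) to P(x + 1)
    return first, second - first ** 2


# Hypothetical expected counts, chosen only to show the pattern.
results = {}
for lam in (10.0, 40.0, 160.0):
    _, raw_var = poisson_transform_moments(float, lam)
    _, root_var = poisson_transform_moments(math.sqrt, lam)
    results[lam] = (raw_var, root_var)
    print(f"mean {lam:6.1f}: variance of count {raw_var:8.3f}, of square root {root_var:.4f}")
```

The variance of the raw count grows in proportion to the mean, while that of the square root stays near one quarter, so a single error variance becomes tenable after transformation.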

One possibility he might mention, although he was not at all sure of its applicability. He had 
had occasion recently to analyse some actuarial data, and in terms of this paper had tried to find 
an estimate of the random variation by a regression analysis with respect to age of first accidents 
of ships from a single builder. That completed, tests were carried out to ascertain whether the 
estimates were sensibly uniform for all builders. With regard to the accident rates of ships, 
however, there were difficulties, as other speakers had shown, in finding any simple formula which 
expressed the dependence on age. The total life of a ship was, perhaps, twenty-five years, and the 
ships dealt with in the paper had been examined over a period of only three years. It might be, 
as others had said, that the effect was not an ageing of the ships, but was associated in some way 
with the skill in handling them of their masters, as they became progressively more familiar with 
their vessels. 

In conclusion he wished to repeat that although he had learned very little about the accident 
rates of any class of vessel he had learned a great deal, for which he was grateful, about statistical 
technique. 

The following comments were received in writing : 

Mr. Babington Smith: Like one of the speakers in the discussion I mistrust the comment 
below Table 5, “It is obvious that there are very significant differences between builders.” 

Since the meeting I have separated the variance “between” and “within” builders in Table I 
and find a ratio 3.1 for 6 and 737 degrees of freedom. In samples from a normal universe P 
for such a value lies between 1 and 0.1 per cent. It is by no means clear to me that, with distri- 
butions such as are found in this paper, the usual tests of significance can be applied and interpreted 
at their face value. For Table II I find a ratio of about 6 for 6 and 522 degrees of freedom, but does 
such a value do more than confirm the view that we are dealing with abnormal distributions? 

My second observation concerns points raised in the discussion. I noted that some speakers 
drew attention to reasons why one ship should be more liable to casualty than another, such as 
the efficiency of the master and crew or the more hazardous nature of some seas. Only one 
speaker, as I recall it, put such a point in a form directly relevant to Dr. Vajda’s findings, by 
suggesting that the ships from different yards might tend to come into service at different times of 
year and so become exposed to risk at times of greater or lesser hazard. 

Unless a cause of variation in risk is in some way associated with the builders (i.e., unless ships 
from certain yards attract a better type of master or were allotted disproportionately often to 
dangerous runs) its direct effect on mean casualty rate per builder may be expected to be negligible. 
On the other hand, since undoubtedly such causes produce differences in casualties between one 
ship and another, they contribute to the general variance, and if their effect could be estimated, 
the differences between one builder and another might well stand out more clearly. 

I should be glad to know if Dr. Vajda agrees with this view, but I should still be grateful if he 
would indicate more clearly why he considers it obvious that the differences between builders 
are significant. 

Mr. Quenouille: I should like to make one or two comments on the problem raised by Dr. 
Vajda at the end of the first section of his paper. Dr. Vajda regretted the lack of a method for 
testing periodicity when different weights are attached to the observations. It seems to me that, 
in this case, the number of observations is so small that we should probably have to content 
ourselves with testing a multiple-classification table with unequal numbers in the different classes 
as described by Dr. Yates. However, since the problem appears to be frequently encountered 
when it is necessary to test percentages and other weighted observations for oscillatory movements, 
I believe that some remarks might be helpful. 

Suppose that the observations $x_1, \ldots, x_n$ have weights $w_1, \ldots, w_n$ and that the variance 
of $x_i$ is $\sigma^2 / w_i$. Suppose also that the weights arise in such a manner that the estimate 
$\bar{x} = \sum w_i x_i / \sum w_i$ 
of the mean is unbiased. It is not difficult to see that these suppositions will not hold if the 
weights are not representative. For example, if we were observing the accident rate on enemy 
vessels, our probability of observations would be greatest in summer when the accident rate 
would be lowest, so that the weighted mean would be biased downward. The detection of such 
a bias assumes some a priori knowledge of what constitutes a representative set of weights, e.g., 
equal ship-days exposed in each month, and this knowledge must form the basis for an unbiased 
estimate of the mean. 

It should be noted that in addition to correcting the observations Xi for the mean, we may also 
wish to take out a regression on wi. By this means we can investigate whether any oscillatory 
movement in the accident rate arises solely from a similar oscillatory movement in the number 
of vessels operating. 

Suppose that we wish to test for periodicity, and that 

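$$X_i = w_i (x_i - \bar{x}), \qquad A = \frac{2}{\sum w_i} \sum X_i \cos\frac{2\pi i}{\lambda} \qquad \text{and} \qquad B = \frac{2}{\sum w_i} \sum X_i \sin\frac{2\pi i}{\lambda}.$$

Then, for large $n$, if $x_1, \ldots, x_n$ are random elements from a normal population, 

$$\operatorname{var} X_i = w_i \sigma^2, \qquad \operatorname{cov}(X_i, X_j) = 0,$$

$$\operatorname{var} A = \frac{2\sigma^2}{\sum w_i} + \frac{2\sigma^2}{(\sum w_i)^2} \sum w_i \cos\frac{4\pi i}{\lambda},$$

$$\operatorname{var} B = \frac{2\sigma^2}{\sum w_i} - \frac{2\sigma^2}{(\sum w_i)^2} \sum w_i \cos\frac{4\pi i}{\lambda},$$

and

$$\operatorname{cov}(A, B) = \frac{2\sigma^2}{(\sum w_i)^2} \sum w_i \sin\frac{4\pi i}{\lambda}.$$

Thus, provided $\sum w_i \sin\frac{4\pi i}{\lambda} \big/ \sum w_i$ and $\sum w_i \cos\frac{4\pi i}{\lambda} \big/ \sum w_i$ are small, i.e., provided $w_i$ does not have 
a term of period $\lambda/2$, we can apply the usual tests of significance to the intensity, $A^2 + B^2$, to 
determine whether the weighted observations have a period $\lambda$, provided that we use $\sum w_i$ instead 
of $n$. 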
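
The computation of the weighted intensity can be sketched in a few lines. The definitions used are those of this passage as far as they can be read: deviations $X_i = w_i(x_i - \bar{x})$ from the weighted mean, and coefficients $A$ and $B$ scaled by $2/\sum w_i$; the exact scaling is a reading of a damaged original and should be treated as an assumption. The function name and the test series are hypothetical.

```python
import math


def weighted_periodogram(x, w, period):
    """Return (A, B, A**2 + B**2) for observations x with weights w at a trial period.

    Assumed definitions: X_i = w_i * (x_i - xbar), with xbar the weighted mean,
    A = (2 / sum(w)) * sum(X_i * cos(2*pi*i/period)), and B likewise with sin.
    Observations are indexed i = 1, ..., n.
    """
    total_w = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    dev = [wi * (xi - xbar) for wi, xi in zip(w, x)]
    a = 2.0 / total_w * sum(
        d * math.cos(2 * math.pi * i / period) for i, d in enumerate(dev, start=1)
    )
    b = 2.0 / total_w * sum(
        d * math.sin(2 * math.pi * i / period) for i, d in enumerate(dev, start=1)
    )
    return a, b, a * a + b * b


# Hypothetical series: three years of monthly values with an annual cosine wave
# of amplitude 2 about a level of 5, and equal weights.
n, period = 36, 12
x = [5.0 + 2.0 * math.cos(2 * math.pi * i / period) for i in range(1, n + 1)]
w = [1.0] * n
a, b, intensity = weighted_periodogram(x, w, period)
print(f"A = {a:.4f}, B = {b:.4f}, intensity = {intensity:.4f}")
```

With equal weights the calculation reduces to the ordinary periodogram, and the example recovers the amplitude of the wave in A; unequal weights enter only through the deviations and the divisor.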

We can, in the same way, use correlogram analysis with estimated correlation coefficients, 
$r_j = \sum X_i X_{i+j} / \sum X_i^2$, to determine any oscillatory movement in the data. Dr. Bartlett’s formulae 
and analysis can be extended to this case. Thus, if L \/ Wi Wi+j then 

(n — .V ^ 

cov ~ ^ ^ s 

yrg rVs-[^l t=- - 00 

and, if the weights are randomly distributed, we get 

OO 

cov (/•„ r, k) ~ _ 2:^ Pi H i < 

so that Dr. Bartlett’s analysis can be applied directly. 

As a simpler alternative to the methods described above, we could use a standardized variable 
$Y_i = \sqrt{w_i}\,(x_i - \bar{x})$ for analysis. This variable has constant variance, $\sigma^2$, so that the usual tests 
of significance can be employed, although these will be less efficient than the above methods. 

When we are dealing with percentages, $p$, the use of the transformation $x = \sin^{-1}\sqrt{p}$, tabulated 
by Fisher and Yates, is to be recommended, so that the variance of $x$ is independent of $p$. Under 
this transformation, $w_i$ becomes the number of observations upon which the percentage is based, 
while $\sigma^2$ takes a known value [illegible in the source]. This determination of $\sigma^2$ allows the Schuster and Walker tests for the periodogram 
to be used. Again, when we are dealing with index numbers, $I$, the transformation $x = \log I$ is 
often very useful. 

Finally, it should be noted that while the above tests are extensions of existing tests, we shall, 
in general, require a larger number of observations if practice is to correspond to theory. 

Mr. D. R. Cox: Although I was unable to be present at the meeting, I was privileged to receive 
a copy of Dr. Vajda’s paper. I was surprised to find that the equation in Section 3a was attributed 
to me, since it has been used in many investigations of randomly occurring events; there was 
certainly no intention of claiming it as original in the R.A.E. Report referred to. 

Dr. Vajda has applied the equation directly to data for which the detailed distribution of 
intervals is known. In many cases these figures will not be available, and it may be of interest 
to indicate how the interval method can be extended to deal (by standard continuous distributions) 
with problems that are usually tackled by the use of the often less convenient discrete Poisson 
distribution. Details are to be found in the R.A.E. Report which Dr. Vajda has mentioned. 

Suppose that $n$ events occur in a period $T$ and that it is permissible to assume that events 
occur randomly in time at the unknown rate $m$. If the period $T$ was measured up to the $n$th event, 
then the result in section 2a of Dr. Vajda’s paper can be used to prove the known result that, for 
fixed $n$, $2mT$ is distributed as $\chi^2$ with $2n$ degrees of freedom. 

In most cases it will only be known that the $n$th event occurred before $T$ and the $(n+1)$th 
event occurred (or would have occurred) after $T$. It is, however, often permissible to assume that 
the choice of the end of the interval $T$ is equivalent to the choice of a point at random between 
the $n$th and the $(n+1)$th events; in other words, that the ratio 

$$\frac{\text{Time from } n\text{th event to } T}{\text{Time between } n\text{th and } (n+1)\text{th events}}$$

has a rectangular distribution over (0, 1). Fisher* has, in another connection, suggested a similar 
device. 

Let $T_1$ be the time up to the $n$th event and let $T = T_1 + T_2$. 
Let $X = 2mT$, $X_i = 2mT_i$ $(i = 1, 2)$. 

The distribution of $X_1$ (for fixed $n$) is the $\chi^2$ distribution with $2n$ degrees of freedom, and it is 
easy to show that the frequency function of $X_2$ is, under the above assumptions, 

$$f(X_2) = \frac{1}{2} \int_{X_2}^{\infty} \frac{e^{-t/2}}{t}\, dt,$$

and that $X_2$ is independent of $X_1$. 

The distribution of $X = X_1 + X_2$ can now be found. For large $n$ it will be very close to the 
$\chi^2$ distribution with $(2n + 1)$ degrees of freedom, and in fact the calculation of cumulants, and 
computations on the exact frequency function, show that this approximation is very good, even 
for $n$ as small as one. 

Thus under the above assumptions, $X = 2mT$ is distributed (for $n$ fixed) as $\chi^2$ with $(2n + 1)$ 
degrees of freedom. 
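
The quality of this approximation can be checked by a small simulation under the stated assumptions: the time to the $n$th event is a sum of $n$ exponential intervals at rate $m$, and $T$ falls at a uniformly random point of the following interval. The sketch compares the first two moments of $2mT$ with those of $\chi^2$ on $2n + 1$ degrees of freedom (mean $2n + 1$, variance $4n + 2$). The function name, seed and number of draws are arbitrary choices for illustration.

```python
import random


def simulate_2mT(n, m, draws, seed=0):
    """Simulate X = 2mT, where T is the time to the nth event plus a uniformly
    random fraction of the gap to the (n+1)th event. Returns (mean, variance)."""
    rng = random.Random(seed)
    values = []
    for _ in range(draws):
        t1 = sum(rng.expovariate(m) for _ in range(n))  # time up to the nth event
        gap = rng.expovariate(m)                        # nth to (n+1)th event
        t2 = rng.random() * gap                         # T falls uniformly in the gap
        values.append(2.0 * m * (t1 + t2))
    mean = sum(values) / draws
    variance = sum((v - mean) ** 2 for v in values) / (draws - 1)
    return mean, variance


n_events, rate = 3, 0.5
sim_mean, sim_var = simulate_2mT(n_events, rate, draws=100_000)
print(f"simulated mean {sim_mean:.3f}, chi-square mean {2 * n_events + 1}")
print(f"simulated variance {sim_var:.3f}, chi-square variance {4 * n_events + 2}")
```

Under these assumptions the exact mean is $2n + 1$ and the exact variance $4n + 5/3$, so the mean agrees with the approximating $\chi^2$ exactly and the variance falls short of $4n + 2$ by one third, whatever the value of $n$.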

To apply this result, suppose that $n_i$ events have occurred in periods $T_i$ $(i = 1, \ldots, k)$, 
that the above assumptions can be made, and that it is required to test the hypothesis 

$$m_1 = m_2 = \cdots = m_k.$$

Then $T_i/(2n_i + 1)$ has (for fixed $n_i$) the distribution of an estimate of variance based on $(2n_i + 1)$ 
degrees of freedom, and so Bartlett’s test may be applied. 

If k = 2,

$$\frac{(2n_2 + 1)\,T_1}{(2n_1 + 1)\,T_2}$$

can be tested in the variance ratio distribution with $(2n_1 + 1,\ 2n_2 + 1)$ degrees of freedom.
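
As a minimal sketch of the arithmetic for k = 2, the function below forms the ratio of the two variance-type estimates $T_i/(2n_i + 1)$ and returns it with its degrees of freedom. The function name and the counts and periods used are hypothetical; the significance level would still be read from tables of the variance ratio.

```python
def interval_variance_ratio(n1, t1, n2, t2):
    """Ratio of T1/(2*n1 + 1) to T2/(2*n2 + 1), with its degrees of freedom."""
    df1 = 2 * n1 + 1
    df2 = 2 * n2 + 1
    ratio = (t1 / df1) / (t2 / df2)
    return ratio, df1, df2

# Hypothetical data: 4 events in a period of 120 units, 9 events in 100 units.
ratio, df1, df2 = interval_variance_ratio(n1=4, t1=120.0, n2=9, t2=100.0)
print(round(ratio, 3), df1, df2)
```

A large ratio points to a lower rate of events in the first period than in the second.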

The second of these tests is probably more convenient arithmetically than the usual test for the
equality of Poisson means, while both the tests given here may be considered as accurate, even
for very small samples ($n_i = 1$).

The control of errors of the first kind in the test in which the $n_i$ are regarded as fixed and the
$T_i$ as variable is not the same as the control in the Poisson test, in which the $T_i$ are fixed and the
$n_i$ variable. However, in many applications the two controls will be equally acceptable.

Dr. Vajda has used the method of intervals in his analysis of ship accident rates; work at the 
Wool Industries Research Association has suggested another problem for which the method 
might be used: that of “ends down” in spinning. Each operative is responsible for one or more
frames each containing a hundred or so bobbins, and every time an end breaks (“end down”), it 
has to be repaired. Here, even if the hypothesis of randomly occurring events has to be discarded, 
the distribution of intervals between successive “ends down” and the joint distribution of adjacent 
intervals remain of direct interest, since these distributions can be related to the optimum number 
of frames per operative. 

In reply, Dr. Vajda thanked the audience for their very kind reception of his paper. He 
would reply in detail in writing; in fact he must do so because he had learned so much from 
their contributions during the last hour that he could not possibly be expected to piece it together 
in such a short time as was available. 

He would say one word about the question of the vessels being homogeneous; in fact they 
formed practically one single class and were all of the same tonnage, for what that information 
was worth, and he thought there could be no possible doubt that for that statistical investigation 
all the vessels could be considered homogeneous as far as their obvious aspects were concerned. 

The second point, raised by Mr. Maddex, which he had appreciated from the beginning and
had mentioned in the paper, was the question of deductions to be made for days spent in dock,
etc. This was a difficult point, and he could have avoided it to a certain extent simply by saying 
that he had not had the information. Their American friends had made a similar investigation. 
They had had to have the log books of all the vessels concerned, and when one of the Lieutenant- 
Commanders of the United States Navy was over in this country and had said how many Wrens 
— or the American equivalent — he had sent for how many weeks to a certain place in the United 

* J.R.S.S., 98, 39.




States in order to enable them to find out what happened, and how many days these ships had 
spent in dock and harbour, he had been quite sure that he was justified in forgetting about this 
altogether. They would not have had the man- or woman-power to do it. He was quite clear 
about the fact now that he had, on the other hand, caused a very great difficulty by admitting that 
he had not done anything about it ; he offered his apologies for this. There was the further diffi- 
culty that the casualties were not such that they could not have happened in dock, harbour, etc. 
Many of them did, in fact, happen while the vessel concerned was in harbour, but he felt he must 
leave it to them to decide whether that was really a good excuse for not tackling the problem. 

He was very grateful for the many suggestions which had been made for the way in which the 
investigation should be carried on. As far as Mr. Chambers’ suggestion regarding seasonal 
variations was concerned, they had started investigations on those lines, but could not finish them
in time. He hoped to say something about that in writing. Some of the suggestions had already 
been considered by the Admiralty ; some had not, but would certainly be considered from now on. 

Dr. Vajda subsequently wrote as follows: 

I do not think that I can add much concerning the time spent in dock and harbour or the type 
of casualty and of vessels, except to say that I was glad experts referred forcefully to the difficulties 
of definition and interpretation inherent in such investigations in general and of shipping in par- 
ticular. Personally, I feel that the efficiency of masters and crew in the vessels investigated was 
sufficiently high and sufficiently consistent to justify me in ignoring the possible influence of these 
factors. 

A few speakers mentioned further possible tests which might have been applied. Thus Mr. 
Maddex wondered whether the comparison of accident rates by applying standardized weights was 
carried out. We did experiment with this method, and the rates obtained by taking the total for
all builders as standard were within 0.1 of the rates given in Tables 1 and 2, except for Builder G,
where 8.3 was obtained for "All Casualties" and 8.4 for "First Casualties" (in lieu of 9.1 and
9.5 in the tables). In any case, this test gives only an idea of the relative accident rates excluding
the effect of age, whereas we were also interested in the age effect itself.

Sir William Elderton suggested a periodicity analysis for individual builders. The time of
exposure was very short (as Mr. Beard has also pointed out), and further subdivision would have 
made the data rather thin. But with further material the comparison mentioned by Sir William 
must not be overlooked. 

The question of seasonal variations gives rise to further tricky problems. I appreciate Mr.
Quenouille’s hints, but his qualifications concerning an unbiased mean appear to be serious. The
decrease in accident rates in June, 1944, and after, noticed by Dr. Sutton, seems to fit quite well 
into the general trend experienced in the earlier years. The objection raised by the Government 
Actuary and by Mr. Chambers is, of course, a very relevant one. In order to see whether the age 
effect is genuine or only a spurious product of the seasonal variations we have introduced a new 
variable, namely, the month. We have constructed a further contingency table, showing the 
accident rates according to month and age, and have applied the technique mentioned in 
Section 3(^) of my paper, but the exclusion of the effect of the seasons did not make the age effect 
disappear. Mr. Babington-Smith’s question, whether the usual tests of significance are applicable
to distributions which appear to be rather unusual, is worth following up, but I feel that if effects
are very significant by the usual tests, then we are justified in thinking that the effects are present,
even if to a smaller extent than appears at first sight.

I agree with Mr. Beard that a Poisson fit to Table 1 may not mean so much as it seems to at 
first glance, because of the uneven distribution of exposure time, and I am grateful to Professor
Greenwood for his illuminating analysis, supplemented by figures. 

I ought not to have used the word "significant" after Table 5, because I did not mean it in its
technical sense. As I have pointed out, this test of intervals cannot be used for far-reaching con- 
clusions in any case. However, Mr. Beard’s approach in statistical terms is, of course, sound. 
Dr. Solomon is quite right in saying that it would have been advantageous to apply a square-root 
transformation to the accident rates. I have mentioned this point myself in an earlier paper, and 
I am glad he brought it up again. 

A few smaller points remain for comment. Dr. Sutton’s question whether I considered a 
Hollerith installation capable of dealing with more detail would have to be investigated in every 
case.

Some "Exits" were due to casualties, but I could not find any trace of this fact affecting the
remainder. Still, with more extensive data this hint given by the Government Actuary should be
followed up.

I have accepted Mr. Chambers’s plea for the use of the word "liability" rather than "proneness" 
and the paper as finally printed does not contain this word. 





Multivariate Analysis 
By M. S. Bartlett 

[Read before the Research Section of the Royal Statistical Society, Thursday,
May 29, 1947, Dr. J. Wishart in the Chair.]

Contents 

1. Preamble.

2. Multivariate analysis of variance. 

3. Canonical reduction of the general regression problem. 

4. Theory of discriminant functions. 

5. Discussion of an example from anthropometry. 

6. The general sampling distribution of the canonical roots. 

7. Further notes. 

Bibliography. 

1. Preamble. 

The advances in recent years in the theory of multivariate statistical analysis seemed to me
to merit a paper on this topic to the Research Section of our Society, and this suggestion was
provisionally approved by the Committee in the early summer of 1946. I mention this date,
because now that I come to write this paper other expositions in the meantime have made me
more doubtful of its necessity. I am thinking particularly of the chapter on multivariate analysis
in Volume II of Kendall’s treatise (33) (an interesting note on the main lines of development in
the United States, India and England, associated respectively with the names of Hotelling,
Mahalanobis and Fisher, will be found at the end of Kendall’s chapter), and of papers recently
presented in the United States, for example, by Tukey (“Vector methods in analysis of variance,”
at Princeton (51)) and by Brown (“Discriminant functions,” at Boston (16)). However, these
expositions allow me to present here an order of development of methods of multivariate analysis
that is more closely associated with personal researches, without my feeling bound to attempt an
exhaustive general exposition. Perhaps I should add that while I have avoided any complicated
analytical discussion of theoretical problems, I have not hesitated on occasion to refer to the
mathematical theory, with the aid of matrix and vector algebra or associated geometrical
representation. Those who prefer a more elementary treatment should also consult Brown’s paper,
which I assume will be published; or familiarize themselves simply with the numerical procedures
illustrated in the various examples to be found here and in the literature.

2. Multivariate analysis of variance. 

In biological, psychological or anthropological work it often happens that we are interested in 
several variates simultaneously; we might wish to consider several characters of a plant, or 
consider the differences between races indicated by several skull measurements. When con- 
sidering the statistical analysis of such a group of correlated variates we shall find it convenient 
to adopt the terminology of the theory of regression, and refer to these variates, which we shall 
analyse in relation to other variates or classifications, as correlated dependent variates. We 
shall also find it useful to make use of vector and matrix notation, a sample of n observations
each in p variates defining automatically a matrix array S of n columns by p rows. The reader
who is unfamiliar with matrix theory is urged to consult any standard work (e.g. Determinants and
Matrices, by A. C. Aitken). In matrix notation the analysis of variance and covariance of a
number of correlated variates is very succinctly expressed (see (3)). For example, if the sample S
is split up into two components which are mutually orthogonal,


$$S = X + Y, \tag{1}$$

say, we have the corresponding analysis of sums of squares and products (analysis of variance
and covariance):

$$SS' = (X + Y)(X + Y)' = XX' + YY', \tag{2}$$

where X' is the transpose of X and the terms YX' and XY' vanish owing to the orthogonality of
the components X and Y. Although in many analyses more than one classification is present,
the above analysis is still sufficient, provided S is a “reduced sample” with irrelevant components
already eliminated, with a consequent loss of degrees of freedom. To illustrate from the paper
just quoted, we are in Table I below, which refers to an analysis of variance and covariance of
the effect of fertilizers on the grain and straw yields of cereals, concerned with the first three rows
of figures, the effect of blocks being eliminated. The last three entries in each line of the Table
correspond with a single matrix term in equation (2), the treatments line corresponding to XX'
and the residual line to YY'.



Table I. (x1 = Straw, x2 = Grain)

                 Degrees of freedom.      x1².         x1x2.         x2².

Treatments .             7              12,496.8      -6,786.6     32,985.0
Residual                49             136,972.6      58,549.0     71,496.1
Total                   56             149,469.4      51,762.4    104,481.1
Blocks                   7              86,045.8      56,073.6     75,841.5
Grand total             63             235,515.2     107,836.0    180,322.6
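
The additivity expressed by equation (2) gives a simple arithmetical check on the table. The sketch below (variable names are illustrative only) confirms that the treatments and residual lines add to the total line and that blocks and total add to the grand total. The negative sign of the treatments sum of products is inferred here from that additivity, since 58,549.0 and 51,762.4 can only be reconciled if the treatments entry is -6,786.6.

```python
# Each line holds (sum of squares x1, sum of products x1x2, sum of squares x2).
treatments = (12496.8, -6786.6, 32985.0)  # sign of the product inferred from additivity
residual = (136972.6, 58549.0, 71496.1)
total = (149469.4, 51762.4, 104481.1)
blocks = (86045.8, 56073.6, 75841.5)
grand_total = (235515.2, 107836.0, 180322.6)

def adds_up(first, second, expected, tol=0.05):
    """True when first + second reproduces expected, entry by entry."""
    return all(abs(a + b - e) < tol for a, b, e in zip(first, second, expected))

print(adds_up(treatments, residual, total))
print(adds_up(blocks, total, grand_total))
```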


I do not personally believe in “portmanteau tests” when more elementary ones will do, but
in cases where it is useful to make an over-all test of the effect of treatments (or other relevant
classification) a test based on the general criterion

$$\Lambda = |YY'| \,/\, |SS'|, \tag{3}$$

where X is the component to be tested, was proposed (this being equivalent to the appropriate
likelihood criterion in the special case of testing the differences in means among k multivariate
samples, considered by Wilks (53) and Pearson and Wilks (40)).

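In the above example we have

$$\Lambda = \begin{vmatrix} 136{,}972.6 & 58{,}549.0 \\ 58{,}549.0 & 71{,}496.1 \end{vmatrix} \Bigg/ \begin{vmatrix} 149{,}469.4 & 51{,}762.4 \\ 51{,}762.4 & 104{,}481.1 \end{vmatrix} = 0.4920.$$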
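
The ratio of determinants is easily verified. The following is a minimal sketch with hypothetical names, using the residual and total lines of Table I as the two 2 x 2 matrices.

```python
def det2(matrix):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]

residual = [[136972.6, 58549.0], [58549.0, 71496.1]]   # YY'
total = [[149469.4, 51762.4], [51762.4, 104481.1]]     # SS'

lam = det2(residual) / det2(total)
print(f"{lam:.4f}")
```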

A general approximate test of $\Lambda$ has been given (4). We calculate

$$\chi^2 = -\{n - \tfrac{1}{2}(p + q + 1)\}\log_e \Lambda,$$

where n is the number of degrees of freedom of S, q the number of degrees of freedom of X, and
p the number of variates. This gives

$$\chi^2 = -51 \log_e 0.4920 = 36.2,$$

with pq = 14 degrees of freedom, or a significance level from the tables of almost exactly P = 0.001.
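
For readers who wish to reproduce the figure, a short sketch of the approximate test follows. The function name is hypothetical; the values n = 56, p = 2 and q = 7 are those of the example, so that the multiplier is 56 - 5 = 51.

```python
import math

def approximate_chi2(lam, n, p, q):
    """Approximate chi-square for the criterion lam, with its degrees of freedom."""
    statistic = -(n - 0.5 * (p + q + 1)) * math.log(lam)
    return statistic, p * q

statistic, df = approximate_chi2(0.4920, n=56, p=2, q=7)
print(round(statistic, 1), df)
```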
Actually, in the case p or q = 2, the distribution of $\sqrt{\Lambda}$ is known from work by Wilks, and it is
possible to make an exact test based on the tables of Fisher’s z. This was done in my original
discussion (3); the exact significance level is obtained by calculating

$$z = \tfrac{1}{2}\log_e\left\{\frac{1 - \sqrt{\Lambda}}{\sqrt{\Lambda}}\cdot\frac{n - q - 1}{q}\right\} = 0.536,$$

with degrees of freedom $n_1 = 2q = 14$, $n_2 = 2(n - q - 1) = 96$. We find, interpolating in the
P = 0.001 significance levels of z, a value z (P = 0.001) = 0.531, again giving a significance
level for the observed z of almost exactly P = 0.001. Of course, one would not usually use an
approximate test when a fairly convenient exact test is available, but I have purposely done so
here to stress the value of the approximate $\chi^2$ test, which is available in general. Its application
to other examples is illustrated later.
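
The exact test can be sketched in the same way. The form of z assumed below, half the logarithm of $(1 - \sqrt{\Lambda})/\sqrt{\Lambda}$ multiplied by $(n - q - 1)/q$, is a reading of the printed expression rather than a quotation of it; it is adopted because it reproduces both the value 0.536 and the degrees of freedom 14 and 96 given in the text. The function name is hypothetical.

```python
import math

def exact_z(lam, n, q):
    """Fisher's z for the case p = 2, with its two degrees of freedom."""
    root = math.sqrt(lam)
    variance_ratio = (1.0 - root) / root * (n - q - 1) / q
    return 0.5 * math.log(variance_ratio), 2 * q, 2 * (n - q - 1)

z, n1, n2 = exact_z(0.4920, n=56, q=7)
print(round(z, 3), n1, n2)
```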

3. Canonical reduction of the general regression problem 

It is well known that the splitting up of a sample in analysis of variance can be regarded as 
the regression analysis of the sample in terms of other variates or pseudo-variates, and, as mentioned 
in the previous section, it is convenient to discuss the analysis in terms of regression. We therefore 
change the notation of equations (1) and (2), and write $S_2$ for S, $S_1$ for the set of “independent
variates” that we are taking the regression on, and $S_{2\cdot1}$ for the “residual component” of $S_2$ after
the regression on $S_1$ has been removed. For equation (1) we now have (cf. (4)),

$$S_2 = P_{21}S_1 + S_{2\cdot1}, \tag{4}$$

where the matrix of regression coefficients is

$$P_{21} = C_{21}C_{11}^{-1}, \tag{5}$$

where $C_{21} = S_2S_1'$, $C_{11} = S_1S_1'$, $C_{22} = S_2S_2'$ and $C_{22\cdot1} = S_{2\cdot1}S_{2\cdot1}'$; and for equation (2) we
have

$$C_{22} = C_{21}C_{11}^{-1}C_{12} + C_{22\cdot1}. \tag{6}$$


The analysis of a new dependent variate obtained by a linear transformation of the original set,
$a'S_2$, say, is given by

$$a'S_2 = a'P_{21}S_1 + a'S_{2\cdot1}, \tag{7}$$

whence

$$a'C_{22}a = a'C_{21}C_{11}^{-1}C_{12}a + a'C_{22\cdot1}a. \tag{8}$$


Equation (8) represents an analysis of variance of the variate $a'S_2$; and the ratio of the first to
the second sum of squares on the right-hand side of (8) measures the significance of the dependence
of $a'S_2$ on $S_1$. To find the particular variate $a'S_2$ which maximizes this ratio we choose the
coefficients a as a solution of the equation

$$(C_{21}C_{11}^{-1}C_{12} - R^2C_{22})\,a = 0, \tag{9}$$

where

$$|C_{21}C_{11}^{-1}C_{12} - R^2C_{22}| = 0. \tag{10}$$

Multiplying (9) on the left by a', we have

$$R^2 = \frac{a'C_{21}C_{11}^{-1}C_{12}a}{a'C_{22}a}, \tag{11}$$


so that $R^2$ is the fraction of the total sum of squares that we have been maximizing, and must be
the largest root of (10). It is also the square of the multiple correlation between $a'S_2$ and $S_1$.
The relation of $a'S_2$ to $S_1$, while maximized, does not in general exhaust the real dependence of
$S_2$ on $S_1$, and if the variate $a'S_2$ is removed, we may still find a significant relation from the
remaining variates (chosen to be uncorrelated with $a'S_2$). In fact, a repeated maximization for
such a reduced sample would be found to lead to another root, the second largest, of the original
equation (10); and so on. Hotelling (28), to whom the above type of analysis is due, has shown
that the original set $S_2$ can be transformed into an equivalent mutually uncorrelated set $AS_2$,
say, each variate of which is associated with one of the roots $R^2$ in (10); such a system of variates
with their corresponding partners from transformations of $S_1$ are called canonical variates, and
the corresponding roots $R^2$, or rather their square roots R, canonical correlations. I shall for
convenience refer to such an analysis as a canonical analysis.

To relate these roots with the over-all test $\Lambda$ defined in equation (3), we may note that equation
(10) may be written

$$|C_{22\cdot1} - (1 - R^2)C_{22}| = 0, \tag{12}$$

whence

$$\prod_{i=1}^{p}(1 - R_i^2) = \frac{|C_{22\cdot1}|}{|C_{22}|} = \frac{|C_{22} - C_{21}C_{11}^{-1}C_{12}|}{|C_{22}|} = \Lambda. \tag{13}$$

This relation is sometimes useful if we wish to test the significance of the remaining roots
$R_{r+1}^2, \ldots, R_p^2$ when the first r roots $R_1^2, \ldots, R_r^2$, say, which are established to be associated
with real relationships between $S_2$ and $S_1$, have been removed. We have

$$\Lambda = \Lambda'\Lambda'' = \prod_{i=1}^{r}(1 - R_i^2)\prod_{j=r+1}^{p}(1 - R_j^2),$$

where the test approximation for $\Lambda$ may be extended to $\Lambda''$ (see (4)), giving

$$\chi^2 = -\{n - \tfrac{1}{2}(p + q + 1)\}\log_e \Lambda'' \tag{14}$$

with (p - r)(q - r) degrees of freedom.

As a simple illustration, let us consider again the example in section 2. We calculate the
determinant in equation (9) for suitable values of $R^2$, the matrix $C_{21}C_{11}^{-1}C_{12}$ being given by
the first row of Table I and $C_{22}$ by the third row. Here the equation is only a quadratic in $R^2$,
but this method still seems the most convenient computational method to find the roots, which
are calculated by interpolation (for a more complicated example see section 5; cf. also Fisher (23),
Ex. 46.2). We obtain

$$R_1^2 = 0.47698, \qquad R_2^2 = 0.05934,$$

and as a check,

$$\Lambda = (1 - R_1^2)(1 - R_2^2) = 0.4920,$$

agreeing with the value given in section 2. The $\chi^2$ approximation noted above thus gives the
analysis

Table II. Approximate χ² Table for Canonical Analysis

Root.        d.f.        χ².

R1²            8        33.06
R2²            6         3.12
Total         14        36.18
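
Since the determinantal equation is here only a quadratic in $R^2$, the roots may also be obtained directly rather than by interpolation. The sketch below is an alternative to the procedure described in the text, not a reproduction of it; the function name is hypothetical, and the treatments sum of products is taken as negative (a sign inferred from the additivity of Table I). Each root is then converted to its $\chi^2$ contribution with the multiplier 51 used in section 2.

```python
import math

def canonical_roots_2x2(a, b):
    """Both roots r of |A - r*B| = 0 for symmetric 2 x 2 matrices A and B."""
    det_a = a[0][0] * a[1][1] - a[0][1] ** 2
    det_b = b[0][0] * b[1][1] - b[0][1] ** 2
    # Coefficient of -r when the determinant is expanded as a quadratic in r.
    middle = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2.0 * a[0][1] * b[0][1]
    disc = math.sqrt(middle * middle - 4.0 * det_a * det_b)
    return (middle + disc) / (2.0 * det_b), (middle - disc) / (2.0 * det_b)

treatments = [[12496.8, -6786.6], [-6786.6, 32985.0]]  # first row of Table I
total = [[149469.4, 51762.4], [51762.4, 104481.1]]     # third row of Table I

r1, r2 = canonical_roots_2x2(treatments, total)
chi2_1 = -51.0 * math.log(1.0 - r1)
chi2_2 = -51.0 * math.log(1.0 - r2)
print(round(r1, 5), round(r2, 5), round(chi2_1, 2), round(chi2_2, 2))
```

The product (1 - r1)(1 - r2) gives back the over-all criterion, and the two contributions add to the total of Table II.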


It is of interest that the first root removes the significant treatments effects. From equation (8)
the linear function corresponding to this larger root is given by the consistent pair of equations

$$58{,}797.1\,a_1 + 31{,}477.2\,a_2 = 0,$$
$$31{,}477.2\,a_1 + 16{,}850.4\,a_2 = 0,$$

giving the function

$$x_2 \text{ (grain)} - 0.535\,x_1 \text{ (straw)}$$

as the quantity affected by the treatments. Of course whether this variate is considered or simply
$x_2$, the yield of grain, depends on what questions the statistical analysis is to answer.
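
The coefficient 0.535 can be recovered from the first of the pair of equations. In the sketch below (hypothetical function name), the matrix of treatments sums of squares and products less $R_1^2$ times the total matrix is formed from Table I, with the treatments product taken as negative, and the ratio $a_1/a_2$ is read off from its first row.

```python
def coefficient_ratio(a, b, root):
    """Ratio a1/a2 satisfying the first row of (A - root*B) a = 0."""
    m11 = a[0][0] - root * b[0][0]
    m12 = a[0][1] - root * b[0][1]
    return -m12 / m11

treatments = [[12496.8, -6786.6], [-6786.6, 32985.0]]
total = [[149469.4, 51762.4], [51762.4, 104481.1]]

ratio = coefficient_ratio(treatments, total, 0.47698)
print(round(ratio, 3))  # coefficient of x1 (straw) when that of x2 (grain) is 1
```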





Even if we are interested in all significant effects, the canonical analysis of $S_2$ is not always
necessary. An alternative factorization of $\Lambda$ corresponds to the separation of $S_2$ into two groups
$S_0$ and $S_{2\cdot0}$, say, based on an internal analysis within $S_2$. An illustration of such an analysis
was given in my discussion on Wishart’s paper (56) on the “statistical treatment of animal
experiments.” It will be recalled that growth curves had been fitted to the weekly weights of pigs, on
which the effects of three different food rations were being investigated. The constants g, h, i,
corresponding to the linear, parabolic and cubic terms in these growth curves, then represented a
multivariate set for all of which the effects of food and other classifications were to be examined.
A simple table summary of the significance effects in relation to food is quoted below,* this
being for the analysis of the log. weights.


Table III. Approximate χ² Analysis

Variable.     d.f.        χ².

g               2        13.28
h.g             2         0.51
i.gh            2         2.29
ghi             6        16.08


As explanation of these figures, we have three dependent variates g, h and i, so that p = 3;
and two independent variates (the pseudo-variates corresponding to the two degrees of freedom
between food treatments), so that q = 2. This means that an exact test for $\Lambda$ could have been
made on the basis of $\sqrt{\Lambda}$ (see section 2) but the $\chi^2$ test was convenient. The component items
correspond to single dependent variables g, h.g and i.gh, which could be tested by the z test, but
again it was preferred to use the table as shown. This is slightly forced, for the best approximation
for a variate like g would be

$$\chi^2 = -\{n - \tfrac{1}{2}(1 + q + 1)\}\log_e(1 - R^2),$$

where R is the multiple correlation between the variate g and the two food pseudo-variates,
whereas the coefficient of $\log_e(1 - R^2)$ actually used was the same throughout as for the “total”
item, viz. $n - \tfrac{1}{2}(3 + q + 1)$, where here n = 21 (19 for “error” + 2 for “food”), and q = 2.
The best coefficients for h.g and i.gh would similarly have been $n - 1 - \tfrac{1}{2}(1 + q + 1)$ and
$n - 2 - \tfrac{1}{2}(1 + q + 1)$. This is in contrast with the canonical analysis formula (14), where the
best coefficient of $\log_e \Lambda''$ remains constant. However, the differences in these coefficients are
not very serious, the analysis at best being only approximate. The important feature is the
predominance of the first contribution to $\chi^2$ in Table III. If a canonical analysis had been made,
the $\chi^2$ table would have had the structure


Root.        d.f.        χ².

R1²            4           A
R2²            2           B
Total          6        16.08

where necessarily 16.08 ≥ A ≥ 13.28, 0 ≤ B ≤ 2.80. It is evident that the first canonical
variate, which would be the only significant one, would further not be significantly different from
the natural variate g, since the remaining $\chi^2$ contribution 2.80, even if concentrated in the 2 d.f.
of B, or in the remaining 2 d.f. of A, could not reach significance. Thus no further analysis is
demanded, and the whole of the differences due to food can be regarded as contained in the
changes in the constant g in the growth-curves of the pig log. weights.
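
The argument about the remaining contribution can be illustrated numerically. The sketch below (illustrative names) adds the components of Table III, finds the remainder after g, and uses the fact that the upper tail of $\chi^2$ with 2 degrees of freedom is $e^{-x/2}$ to show how far 2.80 on 2 d.f. falls short of significance. The probabilities are computed here and are not quoted from the paper.

```python
import math

def chi2_tail_2df(x):
    """Upper-tail probability of chi-square with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

components = {"g": 13.28, "h.g": 0.51, "i.gh": 2.29}
total = sum(components.values())
remainder = total - components["g"]

print(round(total, 2), round(remainder, 2))
print(round(chi2_tail_2df(components["g"]), 4), round(chi2_tail_2df(remainder), 4))
```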


* With a correction to the third item pointed out to me by Dr. Wishart. It should also be noticed
that the results here are based on the adjusted figures after elimination of the effect of initial weights, a
point I had forgotten. This makes n in the ensuing discussion of Table III equal to 21, as noted by Dr.
Wishart in the general discussion following my paper; this was the value actually used to obtain the
values in Table III.





4. Theory of discriminant functions

There is no necessity in the canonical analysis outlined in the last section for the number p
of dependent variates to be less than q, the number of independent variates (or pseudo-variates);
but if p is greater than q, there will be only q roots $R^2$ not zero, as illustrated in Table III. If
q = 1, we have a comparatively very simple but important case which is worth studying separately.
The theory of this case was implicit in Hotelling’s work (25) on the generalization of Student’s t,
where the simultaneous testing of the means of several correlated variates was considered. The
test is equivalent to testing the significance of the angle made between the sample vector $S_1$
representing the independent variate, when this vector is considered to define a direction in n-dimensional
space, and the vector corresponding to that linear function of the correlated set of variates $S_2$
which gives the minimum angle with the first vector. But this angle is simply $\cos^{-1} R$, where R
is the formal multiple correlation between $S_1$ and $S_2$, and the linear function is just the linear
function that would be obtained if a formal multiple regression analysis had been made of the
single variate $S_1$ on the set of variates $S_2$ (the roles of “dependence” and “independence”
thus being formally reversed). In Hotelling’s case the independent variate is the pseudo-variate
corresponding to isolating the sample means, i.e. a variate with value 1 for all observations, but
we have seen that the theory is applicable to all regression problems. The linear function was
considered more explicitly by Fisher and other workers associated with him, and called a
discriminant function, owing to its optimal property in its relation to the independent variate. For
example, if the independent variate $S_1$ is a pseudo-variate with the values +1 for one group
and -1 for the other, the discriminant function will be that linear function of the dependent
variates most able to discriminate members of the two groups.
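
The two-group case just described can be illustrated with a small numerical sketch. The data below are hypothetical and the helper names are not taken from the paper. The coefficients are obtained in two ways: by a formal regression of the +1/-1 pseudo-variate on the two measurements, using total sums of squares and products, and by solving with the within-group matrix and the difference of group means. The two sets of coefficients should differ only by a constant factor.

```python
def centre(values):
    """Deviations of a list of numbers from its mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sp(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def solve2(m, rhs):
    """Solve a 2 x 2 linear system by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det,
            (m[0][0] * rhs[1] - m[1][0] * rhs[0]) / det]

# Hypothetical measurements (x1, x2) on two small groups.
group_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
group_b = [(4.0, 5.0), (5.0, 4.0), (6.0, 6.0), (5.0, 7.0)]

# Formal regression of the pseudo-variate (+1 for A, -1 for B) on x1 and x2.
x1 = centre([point[0] for point in group_a + group_b])
x2 = centre([point[1] for point in group_a + group_b])
y = centre([1.0] * len(group_a) + [-1.0] * len(group_b))
total_ssp = [[sp(x1, x1), sp(x1, x2)], [sp(x1, x2), sp(x2, x2)]]
a_regression = solve2(total_ssp, [sp(x1, y), sp(x2, y)])

# Within-group sums of squares and products, and the difference of means.
within = [[0.0, 0.0], [0.0, 0.0]]
for group in (group_a, group_b):
    u = centre([point[0] for point in group])
    v = centre([point[1] for point in group])
    within[0][0] += sp(u, u)
    within[0][1] += sp(u, v)
    within[1][0] += sp(u, v)
    within[1][1] += sp(v, v)
mean_diff = [sum(point[i] for point in group_a) / len(group_a)
             - sum(point[i] for point in group_b) / len(group_b) for i in range(2)]
a_within = solve2(within, mean_diff)

print(a_regression, a_within)
```

Only the ratio of the coefficients matters for discrimination, so either route may be used in practice.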

Before discussing an example in detail, let us summarize the distributional theory corresponding
to the above formal analysis. If we assume that the dependent variates $S_2$ are jointly normal
(thereby ensuring “spherical symmetry” for this group of vectors in the sample space), and the
independent variate $S_1$ arbitrary (but known), then a test of no real relation between $S_2$ and $S_1$
is based on the random relation of $S_2$ to $S_1$, which is distributionally identical with the relation
of a random $S_1$ to an arbitrary $S_2$. This is just the test of significance that would be used if the
formal multiple regression analysis of $S_1$ on $S_2$ is assumed to be a multiple regression analysis for
which the standard tests of significance will apply.

This justification of tests used with this formal regression analysis may be extended further.
For if we ask whether a significant relation of $S_2$ with $S_1$ is entirely due to one (or more) of the
component variates of $S_2$, say $S_0$, we are led to considering the relation of $S_2$ and $S_1$ in the space
orthogonal to the vector (or vectors) $S_0$, to which a similar inversion of the roles of $S_2$ and $S_1$
may be applied. By identifying a single eliminated variate with a hypothetical discriminant
function, this provides an exact test of difference of the observed with such a hypothetical
discriminant function. Further, if we separate off all but one of the variates of $S_2$, the remaining
relation indicates the necessity of including this last variate in the discriminant function; such a
test is equivalent to providing the coefficient of this variate in the discriminant functions with
its formal standard error (see (6)).

Again, if we had two or more separate sets of data, and calculated a discriminant function
from each, then the test of differences of the discriminant functions would follow the usual
regression method (for in the geometrical representation the discriminant functions, each with the
same number of degrees of freedom, would be in orthogonal spaces, and would be transformed
to the common discriminant function also with the same number of degrees of freedom plus the
residual random part with the remaining degrees of freedom).

Even if $S_2$ has a real (uneliminated) association with $S_1$ (for example, in the difference in
means of two groups a real difference in means may exist), this duality of the relation of $S_2$ with
$S_1$ still holds and links the distributional theory with the corresponding distributional theory
when $S_1$ is the dependent variate. In the difference in means problem the population parameter
is equivalent to Mahalanobis’ generalized distance between the two groups, and the distribution
of the corresponding sample estimate, first given by Bose and Roy (12), will be identical with one
of Fisher’s distributions of the multiple correlation coefficient, viz. the case where the correlation
is due to regression on fixed variables.

This duality is, however, less obvious in the case of real association, and in fact must not be
assumed necessarily to hold in the more general case of the previous section where $S_1$ denotes
more than one variate. In this more general case (see (8) and section 6 of this paper) it may be
shown that for $S_1$ fixed duality only holds if not more than one true root is non-zero. This case
includes as special cases: (i) The case above where $S_1$ denotes one variate, and hence there is
at most one non-zero canonical root. (ii) The case of no real association (cf. (3), p. 338).

5. Discussion of an example from anthropometry 

As an example of a discriminant function I propose to discuss the anthropometric data first
analysed in relation to discriminant functions by M. M. Barnard (2). My calculations for this
example were first intended for a draft chapter in the co-operative work envisaged just before the
war and referred to by Kendall in his Preface to Vol. I of his treatise. Kendall has since made
use of them in his own chapter on multivariate analysis, but as the treatment below is somewhat
fuller than that given by him, it is perhaps still worth giving as illustrative material for this paper.

In Barnard's investigations of the changes taking place with time in four series of Egyptian 
skulls, four variates were selected from a larger set as providing significant information. The 
four sei ies were formed of skulls from the Late Predynastic, Sixth to Twelfth, Twelfth to Thirteenth 
and Ptolemaic dynasties respectively. The four selected variates were*: 

Xi maximum breadth. 

X 2 basialveolar length. 

jCa nasal height. 

x^ basibregmatic height. 

It is required to combine these four characters into a linear function which shall best discri- 
minate the effect of time changes ; that is, to maximize the contribution to the total sum of squares 
in an analysis of variance due to the regression on time, relative to the variance within series. 
Let this function be estimated to be 

a'x “ OiXi -f- a 2 X 2 f 03 X 9 + a^Xi, 

The relevant data for the four variates separately are summarized in Tables IV and V, which
give the means for the four series, and the total sums of squares and products within series,
respectively.

Table IV. Means in the Four Series

Variate.   Series I ($n_1 = 91$).   II ($n_2 = 162$).   III ($n_3 = 70$).   IV ($n_4 = 75$).
$x_1$          133.582418             134.265432          134.371429          135.306667
$x_2$           98.307692              96.462963           95.857143           95.040000
$x_3$           50.835165              51.148148           50.100000           52.093333
$x_4$          133.000000             134.882716          133.642857          131.466667

Table V. Sums of Squares and Products within Series (394 d.f.)

            $x_1$           $x_2$           $x_3$           $x_4$
$x_1$    9661.997470     445.573301    1130.623900    2148.584210
$x_2$                   9073.115027    1239.221990    2255.812722
$x_3$                                  3938.320351    1271.054662
$x_4$                                                 8741.508829

The relative times between the series were taken (from other evidence) in the proportion
2 : 1 : 2, so that in order to obtain the regression on time, the values of the time $t$ may be taken
as $-5$ for all observations in Series I, $-1$ for Series II, $+1$ for Series III and $+5$ for Series IV.
With these values, the mean value of $t$ for the 398 observations is $-0.432161$, so that the values
$t - \bar{t}$ (corresponding to the components of the single vector $S_1$ in the notation of our general
theory) are for the four series respectively, $-4.567839$, $-0.567839$, $+1.432161$, $+5.432161$.

* I am indebted to Dr. Herden (see discussion) for pointing out an earlier misidentification of these
variates with the figures in Tables IV and V. This has now been corrected.
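The time deviations and the sums of products with $t - \bar{t}$ can be retraced from the series sizes and the means of Table IV. The following is a minimal sketch, not part of the original paper; the variable names are illustrative. Because it works from the printed (rounded) means, the sums of products should be expected to agree with the values quoted in the text only to within a few hundredths.

```python
# Sizes of Series I-IV and the time codes assigned to them in the text.
n = [91, 162, 70, 75]
t = [-5.0, -1.0, 1.0, 5.0]

# Series means from Table IV (rows x1..x4, columns Series I..IV).
means = [
    [133.582418, 134.265432, 134.371429, 135.306667],
    [98.307692, 96.462963, 95.857143, 95.040000],
    [50.835165, 51.148148, 50.100000, 52.093333],
    [133.000000, 134.882716, 133.642857, 131.466667],
]

N = sum(n)                                         # total number of skulls
t_bar = sum(ni * ti for ni, ti in zip(n, t)) / N   # weighted mean of t
dev = [ti - t_bar for ti in t]                     # t - t_bar for each series
ss_t = sum(ni * d * d for ni, d in zip(n, dev))    # S(t - t_bar)^2

# Sum of products of each variate with t - t_bar, from the series means.
sp = [sum(ni * m * d for ni, m, d in zip(n, row, dev)) for row in means]

print(N, round(t_bar, 6), [round(d, 6) for d in dev])
print(round(ss_t, 4), [round(v, 2) for v in sp])
```

Only the series means enter the sums of products because $t$ is constant within a series.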





There are altogether three degrees of freedom among the four series, but only one of these
corresponds to the linear regression on time, and (following Barnard's original investigation) we
shall for the moment discard the other two, so that with the elimination of the general mean,
three degrees of freedom have been eliminated, leaving $91 + 162 + 70 + 75 - 3 = 395$, of
which 1 corresponds to the regression on time, and the other 394 to the variation within series
given in Table V.

The sums of products of $x_1$, $x_2$, $x_3$ and $x_4$ with $t - \bar{t}$ are respectively,

$718.76286$, $-1407.26075$, $+410.10194$, $-733.42758$;

and the sum of squares $S(t - \bar{t})^2$ is 4307.66832. We have seen that the coefficients $a_1$, $a_2$,
$a_3$ and $a_4$ may be obtained by our carrying out a formal regression analysis of $t - \bar{t}$ on $x_1$, $x_2$,
$x_3$ and $x_4$. To do this, we need the matrix of sums of squares and products for the total number
of degrees of freedom (395). It is a slight complication in this example that this matrix has not
arisen automatically, and it would in some ways be more convenient to solve the equivalent
equation where we replace the matrix of total sums of squares and products by that within series
(it is readily shown by matrix algebra that the solutions are equivalent, differing merely by a
constant factor). However, the later exact tests based on the regression duality follow more
easily if we keep to the regression analysis, so that the matrix of total sums of squares and products
(395 d.f.) is given in Table VI, obtained by adding to the terms of Table V the terms with 1 d.f.
corresponding to the regression on time. For example, the first term 9781.927828 is obtained
by adding to the first term 9661.997470 in Table V the value $(718.76286)^2/4307.66832$.
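The construction of the total matrix from the within-series matrix can be sketched as follows. This is an illustration only, assuming the rule stated above: each entry of Table V is increased by the product of the two corresponding sums of products with $t - \bar{t}$, divided by $S(t - \bar{t})^2$. The names `within`, `s`, `ss_t` and `total` are introduced here for convenience.

```python
# Upper triangle of Table V (within series, 394 d.f.), rows x1..x4.
upper = [
    [9661.997470, 445.573301, 1130.623900, 2148.584210],
    [9073.115027, 1239.221990, 2255.812722],
    [3938.320351, 1271.054662],
    [8741.508829],
]

# Fill in the full symmetric 4 x 4 matrix.
within = [[0.0] * 4 for _ in range(4)]
for i, row in enumerate(upper):
    for k, value in enumerate(row):
        within[i][i + k] = within[i + k][i] = value

# Sums of products of x1..x4 with t - t_bar, and the sum of squares of t - t_bar.
s = [718.76286, -1407.26075, 410.10194, -733.42758]
ss_t = 4307.66832

# Add the single degree of freedom for the regression on time.
total = [[within[i][j] + s[i] * s[j] / ss_t for j in range(4)] for i in range(4)]

for row in total:
    print([round(v, 6) for v in row])
```

Small differences in the last printed decimals are to be expected, since the sums of products are quoted to five places only.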

Table VI. Total Sums of Squares and Products (395 d.f.)

            $x_1$           $x_2$           $x_3$           $x_4$
$x_1$    9781.927828     210.762489    1199.052135    2026.206952
$x_2$                   9532.849476    1105.246827    2495.414318
$x_3$                                  3977.363203    1201.230304
$x_4$                                                 8866.382928

The reciprocal matrix obtained by solving the equations for the coefficients $a_1$, $a_2$, $a_3$, $a_4$ with

1, 0, 0, 0
0, 1, 0, 0
0, 0, 1, 0
0, 0, 0, 1

on the right-hand side in place of the sum of products of $x_1$, $x_2$, $x_3$ and $x_4$ with $t - \bar{t}$, is shown in
Table VII.

Table VII. Reciprocal Matrix of Sums of Squares and Products ($\times 10^{6}$)

            $x_1$          $x_2$          $x_3$          $x_4$
$x_1$    110.368975      6.938481    -28.145236    -23.361935
$x_2$                  115.693529    -24.948984    -30.767069
$x_3$                                273.988409    -23.666591
$x_4$                                              129.990069

Multiplying this matrix by the sum of products of $x_1$, $x_2$, $x_3$ and $x_4$ with $t - \bar{t}$, we obtain the
solution

$a_1 = +0.075156739$
$a_2 = -0.145490050$
$a_3 = +0.144600884$
$a_4 = -0.078538419$
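The multiplication just described is easily retraced. The sketch below is illustrative and not part of the original paper: it takes the reciprocal matrix of Table VII (entries scaled by $10^{-6}$) and the sums of products with $t - \bar{t}$ quoted earlier, with the sum of products for $x_3$ taken as positive, and forms the coefficients together with the regression sum of squares $\sum a_i s_i$.

```python
# Upper triangle of Table VII; the printed entries are the reciprocal matrix x 10^6.
upper = [
    [110.368975, 6.938481, -28.145236, -23.361935],
    [115.693529, -24.948984, -30.767069],
    [273.988409, -23.666591],
    [129.990069],
]

# Full symmetric reciprocal matrix on its natural scale.
recip = [[0.0] * 4 for _ in range(4)]
for i, row in enumerate(upper):
    for k, value in enumerate(row):
        recip[i][i + k] = recip[i + k][i] = value * 1e-6

# Sums of products of x1..x4 with t - t_bar.
s = [718.76286, -1407.26075, 410.10194, -733.42758]

# Coefficients of the discriminant function: a = (reciprocal matrix) s.
a = [sum(recip[i][j] * s[j] for j in range(4)) for i in range(4)]

# Sum of squares (4 d.f.) accounted for by the regression of t - t_bar on the x's.
reg_ss = sum(ai * si for ai, si in zip(a, s))

print([round(ai, 9) for ai in a])
print(round(reg_ss, 4))
```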




The significance of the relation of the variates $x_1$, $x_2$, $x_3$ and $x_4$ with time is most easily tested by
continuing formally with the regression analysis of the vector $S_1$ (i.e. the variate $t - \bar{t}$) on $x_1$, $x_2$,
$x_3$ and $x_4$. The multiple regression component with four degrees of freedom is

$$a_1 \times 718.76286 - a_2 \times 1407.26075 + a_3 \times 410.10194 - a_4 \times 733.42758 = 375.6657.$$

The formal analysis of variance of $t - \bar{t}$, providing from the theory indicated in the previous
section an exact test of significance, is thus (Table VIII):

Table VIII.

                                                  d.f.    Sum of squares.    Mean square.
Regression on $x_1$, $x_2$, $x_3$ and $x_4$          4        375.6657
Remainder                                          391       3932.0026          10.0563
Total                                              395       4307.6683



The exact standard errors of the coefficients, in the sense explained in section 4, are obtained
in the usual way. For example, for $a_1$, we multiply the error variance 10.0563 (391 d.f.) by the
appropriate term, $110.368975 \times 10^{-6}$, in the reciprocal matrix, and take the square root. In
this way we obtain finally

$a_1 = +0.0752 \pm 0.0333$
$a_2 = -0.1455 \pm 0.0341$
$a_3 = +0.1446 \pm 0.0525$
$a_4 = -0.0785 \pm 0.0362$

Since every coefficient exceeds twice its standard error, we may say at once that every variate is
useful in our discriminant function. As in direct regression analysis, these standard errors do
not, of course, give us the whole story, since the coefficients are not independent; their use is,
as we have seen, merely one aspect of the various analysis of variance tests that may be relevant.
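The standard errors quoted above follow from Table VIII and the diagonal of Table VII. A minimal sketch of the arithmetic, with illustrative variable names that are not the author's:

```python
from math import sqrt

# Analysis of variance of t - t_bar (Table VIII).
total_ss = 4307.6683
reg_ss = 375.6657
resid_df = 391

resid_ss = total_ss - reg_ss          # remainder sum of squares
mean_square = resid_ss / resid_df     # error variance on 391 d.f.

# Diagonal terms of the reciprocal matrix (Table VII, printed x 10^6).
diag = [110.368975e-6, 115.693529e-6, 273.988409e-6, 129.990069e-6]

# Discriminant function coefficients from the solution above.
a = [0.075156739, -0.145490050, 0.144600884, -0.078538419]

# Standard error of each coefficient: sqrt(error variance x diagonal term).
se = [sqrt(mean_square * c) for c in diag]

# Ratio of each coefficient to its standard error.
ratio = [abs(ai) / si for ai, si in zip(a, se)]

print(round(mean_square, 4), [round(v, 4) for v in se])
print([round(r, 2) for r in ratio])
```

Each ratio exceeds 2, which is the basis of the remark that every variate is useful in the discriminant function.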

The coefficients above differ from those given by Barnard, since she took an unweighted
regression of the variates $x_1$, $x_2$, $x_3$ and $x_4$ with time, instead of a weighted regression. The difference
this makes is, however, slight, as is shown by the following comparison when the coefficient of
$x_1$ is adjusted to be unity.

            Above analysis.    Barnard.
$a_1$          (1.000)         (1.000)
$a_2$           1.935           1.938
$a_3$           1.923           2.005
$a_4$           1.044           1.062


The practical interpretation of the above analysis appears to me to need some care. We
have established a linear function for discriminating between the four series, with particular
reference to their time-order, based on the internal variability of the series. But in doing so we
discarded two degrees of freedom representing further differences among the means of the four
series. Let us bring these back and analyse the total variability among means on the basis of
the more general theory of sections 2 and 3. We shall not yet make a canonical analysis, which
would lose the discriminant function just worked out, but merely examine the significance of
the variability among means in the remaining two degrees of freedom. We can do this by means
of the $\Lambda$ criterion.

First let us translate the significance of the discriminant function available from Table VIII.
From this Table the exact significance may of course be estimated by means of Fisher's $z$ test.
As a partly independent calculation of the same significance, we have the component factor of $\Lambda$
arising from the regression on $t - \bar{t}$ given by the ratio of the determinants of the matrices in
Tables V and VI, i.e.

$$\Lambda' = \frac{0.24269054 \times 10^{16}}{0.26587779 \times 10^{16}} = 0.9127914.$$

This checks with the ratio of residual to total sum of squares in Table VIII, i.e.

$$\frac{3932.0026}{4307.6683} = 0.9127914,$$

indicating the identity of the two tests. Using, however, the $\chi^2$ approximation, we have

$$\chi_1^2 = -\{395 - \tfrac{1}{2}(1 + 4 + 1)\} \log_e 0.9127914 = 35.81 \quad (4 \text{ d.f.}).$$
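The identity of the determinant ratio with the ratio of residual to total sum of squares can be checked numerically. The sketch below is illustrative and not part of the original paper: it rebuilds the total matrix from Table V and the sums of products with $t - \bar{t}$, and uses a small Gaussian-elimination routine (`det`, written here for the purpose) for the determinants.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    size = len(a)
    d = 1.0
    for k in range(size):
        piv = max(range(k, size), key=lambda r: abs(a[r][k]))
        if a[piv][k] == 0.0:
            return 0.0
        if piv != k:
            a[k], a[piv] = a[piv], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, size):
            f = a[r][k] / a[k][k]
            for c in range(k, size):
                a[r][c] -= f * a[k][c]
    return d

# Table V (within series), upper triangle, rows x1..x4.
upper = [
    [9661.997470, 445.573301, 1130.623900, 2148.584210],
    [9073.115027, 1239.221990, 2255.812722],
    [3938.320351, 1271.054662],
    [8741.508829],
]
within = [[0.0] * 4 for _ in range(4)]
for i, row in enumerate(upper):
    for k, value in enumerate(row):
        within[i][i + k] = within[i + k][i] = value

# Total matrix: within-series matrix plus the 1 d.f. regression on time.
s = [718.76286, -1407.26075, 410.10194, -733.42758]
ss_t = 4307.66832
total = [[within[i][j] + s[i] * s[j] / ss_t for j in range(4)] for i in range(4)]

lam_reg = det(within) / det(total)    # component factor of Lambda
ss_ratio = 3932.0026 / 4307.6683      # residual / total from Table VIII

print(det(within), round(lam_reg, 7), round(ss_ratio, 7))
```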


Alternatively, let us now calculate $\Lambda$ for all the differences among the means. To do this
we construct the matrix of sums of squares and products about the general mean, as in Table IX,

Table IX. Sums of Squares and Products about the General Mean (397 d.f.)

            $x_1$           $x_2$           $x_3$           $x_4$
$x_1$    9785.178098     214.197666    1217.929248    2019.820216
$x_2$                   9559.460890    1131.716372    2381.126040
$x_3$                                  4088.731856    1133.473898
$x_4$                                                 9382.242720

and thus calculate the ratio of the determinants of the matrices in Tables V and IX, i.e.

$$\Lambda = \frac{0.24269054 \times 10^{16}}{0.29544775 \times 10^{16}} = 0.8214344.$$

This gives

$$\chi^2 = -\{397 - \tfrac{1}{2}(3 + 4 + 1)\} \log_e 0.8214344 = 77.30 \quad (12 \text{ d.f.}).$$


The difference between $\chi_1^2$ and $\chi^2$ is due to the missing factor $\Lambda''$ in

$$\Lambda = \Lambda' \Lambda'',$$

where

$$\Lambda'' = 0.8999148$$

and

$$\chi_2^2 = -\{397 - \tfrac{1}{2}(2 + 4 + 1)\} \log_e 0.8999148 = 41.50 \quad (8 \text{ d.f.}).$$

Here the additive property of $\chi^2$ has not been forced (as in Table II), so that $\chi^2 \neq \chi_1^2 + \chi_2^2$
exactly, but the difference is trivial. The important point is the highly significant value of $\chi_2^2$,
indicating further significant differences in the means when compared with the internal variability
within series, in addition to that earlier attributed to time changes between the series.
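The $\chi^2$ approximation used in these calculations has the form $-\{n - \tfrac{1}{2}(p + q + 1)\} \log_e \Lambda$. A minimal sketch follows, with a function name (`bartlett_chi2`) chosen here for illustration; it is applied to the overall $\Lambda$ and to the remaining factor $\Lambda''$ with the values of $n$, $p$ and $q$ used in the text.

```python
from math import log

def bartlett_chi2(lam, n, p, q):
    """Chi-square approximation -{n - (p + q + 1)/2} log_e(lam)."""
    return -(n - 0.5 * (p + q + 1)) * log(lam)

lam_all = 0.8214344     # ratio of determinants, Tables V and IX
lam_reg = 0.9127914     # factor due to the regression on time

# Remaining factor after removing the regression on time.
lam_rest = lam_all / lam_reg

chi2_all = bartlett_chi2(lam_all, n=397, p=4, q=3)      # 12 d.f.
chi2_rest = bartlett_chi2(0.8999148, n=397, p=4, q=2)   # 8 d.f.

print(round(lam_rest, 7), round(chi2_all, 2), round(chi2_rest, 2))
```

The degrees of freedom in each case are the product $pq$ of the numbers of dependent and independent variates concerned.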

This indicates that while the linear function we earlier constructed may efficiently discriminate 
the series with particular reference to their time order, we must be more cautious about ascribing 
such differences between the series to time, as further differences of comparable magnitude exist 
between the series unconnected with the time order. 

It is of some interest to make an analysis of $\chi^2$ in terms of the canonical roots (cf. the example
in section 3), to check that there is no single linear function which isolates the differences in means
if we do not correlate them with the time order but consider simultaneously all the mean differences.

From the value of $\Lambda$ we know that no factor $\lambda$ is less than 0.82. Taking the matrices given
in Tables V and IX, which in matrix notation are $C_{22.1}$ and $C_{22}$ respectively, we consider equation
(10) in the form

$$|C_{22.1} - (1 - R^2)\, C_{22}| = 0.$$

The determinant in this equation was evaluated for $1 - R^2 = 0.80, 0.81, \ldots, 0.85$, as shown
in Table X.


Table X. (Determinant $\times 10^{-4}$)

$1 - R^2$.    Determinant.     $\Delta$.       $\Delta^2$.     $\Delta^3$.    $\Delta^4$.
0.80          135600262     -35092410      7041522      -978237       70910
0.81          100507852     -28050888      6063285      -907327       70907
0.82           72456964     -21987603      5155958      -836420
0.83           50469361     -16831645      4319538
0.84           33637716     -12512107
0.85           21125609

The fourth difference should be constant, and enough values have been taken to provide a
check. The values of the determinant can now be built up for all values of $1 - R^2$. Since there
are only three degrees of freedom in $S_1$, the largest root for $1 - R^2$ is necessarily unity. Apart
from this value, we locate the roots at about 0.89, 0.94 and 0.98. By inverse interpolation, we
obtain the more precise values:

$1 - R_1^2 = 0.89017356$, $1 - R_2^2 = 0.94038339$, $1 - R_3^2 = 0.98130990$.
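The tabulation and root-finding just described can be retraced numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it evaluates the determinant of $C_{22.1} - \lambda C_{22}$ directly from Tables V and IX, forms the difference table, and locates the roots by bisection rather than by inverse interpolation. All function and variable names are introduced here.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    size = len(a)
    d = 1.0
    for k in range(size):
        piv = max(range(k, size), key=lambda r: abs(a[r][k]))
        if a[piv][k] == 0.0:
            return 0.0
        if piv != k:
            a[k], a[piv] = a[piv], a[k]
            d = -d
        d *= a[k][k]
        for r in range(k + 1, size):
            f = a[r][k] / a[k][k]
            for c in range(k, size):
                a[r][c] -= f * a[k][c]
    return d

def symmetric(upper):
    """Build a full symmetric matrix from its upper triangle."""
    size = len(upper)
    m = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(upper):
        for k, value in enumerate(row):
            m[i][i + k] = m[i + k][i] = value
    return m

W = symmetric([  # Table V, within series
    [9661.997470, 445.573301, 1130.623900, 2148.584210],
    [9073.115027, 1239.221990, 2255.812722],
    [3938.320351, 1271.054662],
    [8741.508829],
])
T = symmetric([  # Table IX, about the general mean
    [9785.178098, 214.197666, 1217.929248, 2019.820216],
    [9559.460890, 1131.716372, 2381.126040],
    [4088.731856, 1133.473898],
    [9382.242720],
])

def f(lam):
    """Determinant of W - lam * T, a quartic in lam."""
    return det([[W[i][j] - lam * T[i][j] for j in range(4)] for i in range(4)])

def differences(v):
    return [b - a for a, b in zip(v, v[1:])]

# Difference table for lam = 0.80, 0.81, ..., 0.85.
vals = [f(0.80 + 0.01 * k) for k in range(6)]
d4 = differences(differences(differences(differences(vals))))

def bisect(lo, hi, steps=60):
    """Root of f in (lo, hi), assuming a sign change between the ends."""
    flo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Scan 0.85 .. 0.995 for sign changes; the root at unity is left out.
pts = [0.85 + 0.005 * k for k in range(30)]
roots = [bisect(lo, hi) for lo, hi in zip(pts, pts[1:]) if f(lo) * f(hi) < 0]

print([round(v * 1e-4) for v in vals])
print([round(v * 1e-4) for v in d4], [round(r, 5) for r in roots])
```

Because the determinant is a quartic in $1 - R^2$, its fourth differences at equal steps are constant apart from rounding, which is the check referred to in the text.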

This gives the $\chi^2$ analysis for $\Lambda$ shown in Table XI, the total checking with the value obtained
earlier.

Table XI

Root.       d.f.    $\chi^2$.
$R_1^2$       6      45.72
$R_2^2$       4      24.16
$R_3^2$       2       7.42
Total        12      77.30


It is evident that no single root is able to isolate the mean differences. 
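The partition in Table XI can be reproduced from the three roots. In the sketch below (illustrative only), each root contributes $-\{n - \tfrac{1}{2}(p + q + 1)\} \log_e (1 - R_i^2)$ with $n = 397$, $p = 4$, $q = 3$; the rule $p + q + 1 - 2i$ for the degrees of freedom of the $i$-th root is an assumption adopted here because it reproduces the 6, 4 and 2 of the table.

```python
from math import log

n, p, q = 397, 4, 3

# Values of 1 - R_i^2 obtained by inverse interpolation in the text.
one_minus_r2 = [0.89017356, 0.94038339, 0.98130990]

multiplier = n - 0.5 * (p + q + 1)

# Contribution of each root to the chi-square for Lambda.
chi2 = [-multiplier * log(v) for v in one_minus_r2]

# Degrees of freedom allotted to the i-th root (assumed rule, see above).
df = [p + q + 1 - 2 * i for i in (1, 2, 3)]

for d, c in zip(df, chi2):
    print(d, round(c, 2))
print(sum(df), round(sum(chi2), 2))
```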

I shall not attempt to investigate the situation further, but it is perhaps relevant to note the
scientific caution urged by Chattopadhyay (17) in the interpretation of statistically significant mean
differences in anthropology, owing to observers' bias, unrepresentative samples, and so on. In
the present case, although the second and fourth series were measured by the same workers, they
display from a casual inspection differences in means of the same order of magnitude as with the
other series so that the differences will be presumed genuine; this still leaves, however, the
representativeness of the series as evidence for secular change doubtful, in view of the further
differences in means noted.

6. The General Sampling Distribution of the Canonical Roots

An important advance in the sampling theory of the canonical analysis outlined in section 3
was made in 1939 with the derivation of the distribution of the canonical roots (Fisher (21),
Hsu (29) and Roy (43)) in the case of no real association between $S_2$ and $S_1$. This remarkable
distribution, obtained on the usual assumption of the normality of the dependent variates but
showing the now familiar duality in form for interchange of dependent and independent variates,
may be written

$$C \prod_{i=1}^{p} \Big\{ (R_i^2)^{\frac{1}{2}(q-p-1)} (1 - R_i^2)^{\frac{1}{2}(n-p-q-1)} \Big[ \prod_{j=i+1}^{p} (R_i^2 - R_j^2) \Big] dR_i^2 \Big\},$$

where

$$C = \prod_{i=0}^{p-1} \frac{\pi^{\frac{1}{2}}\, \Gamma[\tfrac{1}{2}(n - i)]}{\Gamma[\tfrac{1}{2}(p - i)]\, \Gamma[\tfrac{1}{2}(q - i)]\, \Gamma[\tfrac{1}{2}(n - q - i)]}, \qquad (p \le q,\ n \ge p + q). \tag{15}$$


Being a simultaneous distribution, it does not provide at once tests of significance for the canonical
roots, but further investigation of the latter problem has since been made, for example, by
Roy (46). A more complete grasp of the efficiency of possible statistics and tests is only possible
on the basis of the canonical correlation distribution when real association between $S_2$ and $S_1$
exists. A general method of obtaining this distribution for the regression and correlation
problems considered in this paper I have given elsewhere (8). This distribution, like that of a
single multiple correlation coefficient, has two forms corresponding to (i) the true correlation
case, and (ii) the regression case on fixed variates, this including such special cases as non-zero
means. For example, in the case of only one non-zero true canonical root, the distribution has
in both cases a factor depending on this root involving a generalized hypergeometric function.
Equivalently, in operational form, this factor is


11(1 ^ ‘)n(i (16) 

i-- 1 

in case (i) and 

lie ip^q; z U (1 - 1 (17) 

/= I 


in case (ii), where H denotes the operation of taking the term independent of $z$, and the $F$ functions
are defined by

$$F(a_1, a_2;\, b_1, b_2;\, x) = 1 + \frac{a_1 a_2}{b_1 b_2}\, x + \frac{a_1(a_1 + 1)\, a_2(a_2 + 1)}{b_1(b_1 + 1)\, b_2(b_2 + 1)}\, x^2 + \ldots,$$

$$F(a;\, b_1, b_2;\, x) = 1 + \frac{a}{b_1 b_2}\, x + \frac{a(a + 1)}{b_1(b_1 + 1)\, b_2(b_2 + 1)}\, x^2 + \ldots$$

The remaining factor in the distribution is simply the distribution (15). If more than one true 
root is non-zero, the distributions become more complicated. 

Theoretically these results allow the further investigation of the distributional properties of
particular sample roots and statistics to be made. Thus Roy, who first obtained for case (ii) a
result equivalent to (17) above, has further investigated the distribution of the largest root from
this distribution (Roy's work (45), (47) and (48) should be read with the reservation that it does
not apply to the case of more than one non-zero root; see Anderson (1) and Bartlett (8)).
However, much work remains to be done before the statistician can conveniently make use of the
results sketched in this section for practical purposes.

One particular point that needs further investigation, as I have stressed elsewhere (4), is the
validity of the $\chi^2$ approximation for $\Lambda$ after the removal of one or more canonical roots (see also
(7) and (31)). It is easily demonstrated that the approximation fails if the true values of the
roots removed are zero. For example, if $p = 2$, $q = 3$, we easily obtain the probability integral
of $R_2^2$ from (15), as $(1 - R_2^2)^{n-3}$, or $-2(n - 3) \log_e (1 - R_2^2)$ is a $\chi^2$ with 2 d.f., whereas the
$\chi^2$ approximation used earlier would have taken only half this quantity, i.e. $-(n - 3) \log_e (1 - R_2^2)$
as a $\chi^2$ with the same number of degrees of freedom. An interesting property, however, that
seems worth noting pending investigation of the more exact significance values, is that the $\chi^2$
analysis becomes approximately valid even for zero true roots if the sample roots removed are
large. For from (15) the distribution of $R_i^2$ $(i = s + 1, \ldots, p)$ for given $R_i^2$ $(i = 1, \ldots, s)$ tends
as $R_i^2 \to 1$ $(i = 1, \ldots, s)$ to the form

$$C \prod_{i=s+1}^{p} \Big\{ (R_i^2)^{\frac{1}{2}(q-p-1)} (1 - R_i^2)^{\frac{1}{2}(n-p-q+2s-1)} \Big[ \prod_{j=i+1}^{p} (R_i^2 - R_j^2) \Big] dR_i^2 \Big\}, \tag{18}$$

which is the same form as (15) with $p - s$ for $p$, $q - s$ for $q$, and $n$ for $n$. Hence the $\chi^2$
approximation proposed, viz.

$$\chi^2 = -\{n - \tfrac{1}{2}(p + q + 1)\} \log_e \prod_{i=s+1}^{p} (1 - R_i^2),$$

with $(p - s)(q - s)$ degrees of freedom becomes appropriate (apart perhaps from its use of
$n - s$ for $n$). In large samples the limiting sufficiency of the largest roots for the non-zero
true roots implies further that the distribution (18) will remain approximately valid when the
$R_i^2$ are large whether the true roots are zero or not, but it is curious that provided the sample
values are large, the question whether the true values are large seems to become comparatively
unimportant as far as the test of the remaining roots is concerned.
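The illustration with two dependent and three independent variates in the paragraph above can be checked directly from (15). The following is a sketch of the integration only; the shorthand $u = 1 - R_1^2$, $v = 1 - R_2^2$ and $a = \tfrac{1}{2}(n - 6)$ is introduced here for convenience and does not appear in the original.

```latex
% With p = 2, q = 3 the power of R_i^2 in (15) is zero, so the joint density is
\[
  f(R_1^2, R_2^2) \;\propto\; (1 - R_1^2)^{a}\,(1 - R_2^2)^{a}\,(R_1^2 - R_2^2),
  \qquad R_1^2 > R_2^2, \quad a = \tfrac{1}{2}(n - 6).
\]
% Integrate out the larger root, with u = 1 - R_1^2 running from 0 to v = 1 - R_2^2:
\[
  \int_0^{v} u^{a}\,(v - u)\,du \;=\; \frac{v^{a+2}}{(a + 1)(a + 2)} .
\]
% The density of v is therefore proportional to v^{a} v^{a+2} = v^{n-4} on (0, 1), so that
\[
  \Pr\{\,1 - R_2^2 \le v\,\} \;=\; v^{\,n-3},
\]
% and, for any x > 0,
\[
  \Pr\{-2(n - 3)\log_e(1 - R_2^2) > x\}
  \;=\; \Pr\{\,v < e^{-x/\{2(n-3)\}}\,\}
  \;=\; e^{-x/2},
\]
% which is the probability of exceeding x for a chi-square with 2 d.f.
```

The multiplier of $-\log_e (1 - R_2^2)$ is thus $2(n - 3)$, twice that given by the approximation, in agreement with the statement in the text.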


7. Further Notes 

In this final section are added some miscellaneous notes. 

(i) Attention has been concentrated in this paper on the problems arising in the relation of 
one group of variates with another. This, in the form of analysis of variance and covariance, is 
perhaps the most important class of problems that arises, but is not, of course, the only one. 
A second class that has received considerable attention in the literature is the internal “canonical
analysis” of a group of variates associated with the ideas of “factor analysis,” and especially 
used in psychology. This includes Hotelling's method of analysis into “principal components” 
(26) ; for the distributional theory of this analysis, see, for example, Wilks (54). 

(ii) In the case of several dependent variates the statistical analysis can become rather heavy, 
and some promise that the data are such as to make the analysis worth while, as in other types 
of statistical analysis, is needed. The assumptions underlying the analysis need watching, for 
example, if these methods are applied to economic data (cf. Tintner (50)). 

(iii) In the exact theory of discriminant functions outlined in section 4, it was noted that the 
values of the independent variates Si must be known. When this is not true this theory will 
not apply, as, for example, was pointed out in (6) in connection with applications of discriminant 
function technique in plant breeding. In the later discussion in that paper on approximate 
standard errors, there is, however, an error which requires correction. I have asserted on page 172
that the composite entries in the varieties row of the analysis of variance and covariance table, 
when inflated by true genetic components, may be regarded as made up of two independent 
random parts, whereas they are actually similar to composite entries in other analysis of variance 
tables based on equal numbers in the different classes in varying like homogeneous random 
quantities, but with composite expectations. Thus formula (13) on page 172, giving the error 
variability due to environment, is obviously incorrect, as it should tend to zero if the number of 
replications of each variety is increased. The correction and further extension of these formulae 
is under investigation by D. N. Nanda. 

(iv) An interesting point was raised in the discussion at Boston following Brown’s paper (16). 
It was pointed out by Cochran that covariance may be used with discriminant functions to correct
for further variates (this follows from the general theory indicated in sections 2, 3 and 4, with the 
corresponding reduction in degrees of freedom for S^. Such an adjusted discriminant function 
will be a linear function of the dependent variates and the concomitant variates, but must not 
be confused with a discriminant function obtained by treating the concomitant variates as further 
dependent variates, an analysis which would be less relevant and therefore less efficient. I
understand that a paper illustrating this point on an actual example will be published by Cochran
and Bliss. 

(v) In section 3 reference was made to an analysis by Fisher (23, Ex. 46.2) in connection
with methods of computation. Fisher's analysis is also worth studying as an example of a
canonical analysis used to determine an efficient scale for experimental scores first made on an
arbitrary scale (see also Tukey (51) and my concluding remarks in (9)).


Bibliography 

(1) Anderson, T. W. (1946), “The non-central Wishart distribution and certain problems of multivariate
statistics,” Ann. Math. Stat., 17, 409.

(2) Barnard, M. M. (1935), “The secular variation of skull characters in four series of Egyptian skulls,”
Ann. Eugen., 7, 189.

(3) Bartlett, M. S. (1934), “The vector representation of a sample,” Proc. Camb. Phil. Soc., 30, 327.

(4) Bartlett, M. S. (1938), “Further aspects of the theory of multiple regression,” ibid., 34, 33.

(5) Bartlett, M. S. (1939), “A note on tests of significance in multivariate analysis,” ibid., 35, 180.

(6) Bartlett, M. S. (1939), “The standard errors of discriminant function coefficients,” J.R.S.S. (Supp.), 6, 169.

(7) Bartlett, M. S. (1941), “The statistical significance of canonical correlations,” Biometrika, 32, 29.

(8) Bartlett, M. S. (1947), “The general canonical correlation distribution,” Ann. Math. Stat., 18, 1.

(9) Bartlett, M. S. (1947), “The use of transformations,” Biometrics, 3, 39.

(10) Bose, R. C. (1936), “On the exact distribution and moment-coefficients of the $D^2$-statistic,” Sankhya,
2, 143.

(11) Bose, R. C. (1936), “A note on the distribution of differences in mean values of two samples drawn from
two multivariate normally-distributed populations, and the definition of the $D^2$-statistic,” ibid., 2, 379.

(12) Bose, R. C., and Roy, S. N. (1938), “The distribution of the studentized $D^2$-statistic,” ibid., 4, 19.

(13) Bose, R. C., and Roy, S. N. (1938), “The use and distribution of the studentized $D^2$-statistic when the
variances and covariances are based on k samples,” ibid., 4, 535.

(14) Bose, S. N. (1936), “On the complete moment-coefficients of the $D^2$-statistic,” ibid., 2, 385.

(15) Bose, S. N. (1937), “On the moment-coefficients of the $D^2$-statistic and certain integral and differential
equations connected with the multivariate normal population,” ibid., 3, 105.

(16) Brown, G. W. (1946), “Discriminant functions.” (Paper presented in the Biometrics Section
programme at Boston on December 28.)

(17) Chattopadhyay, K. P. (1942), “Application of statistical methods to anthropological research,”
Sankhya, 6, 99.

(18) Fisher, R. A. (1936), “The use of multiple measurements in taxonomic problems,” Ann. Eugen., 7, 179.

(19) Fisher, R. A. (1936), “The coefficient of racial likeness,” J. Roy. Anthrop. Soc., 66, 57.

(20) Fisher, R. A. (1938), “The statistical utilization of multiple measurements,” Ann. Eugen., 8, 376.

(21) Fisher, R. A. (1939), “The sampling distribution of some statistics obtained from non-linear equations,”
ibid., 9, 238.

(22) Fisher, R. A. (1940), “The precision of discriminant functions,” ibid., 10, 422.

(23) Fisher, R. A. (1946), Statistical Methods for Research Workers. Edinburgh: Oliver & Boyd, 10th ed.

(24) Girschick, M. A. (1939), “On the sampling theory of the roots of determinantal equations,” Ann.
Math. Stat., 10, 203.

(25) Hotelling, H. (1931), “The generalization of Student's ratio,” ibid., 2, 360.

(26) Hotelling, H. (1933), “Analysis of a complex of statistical variables into principal components,” J. Educ.
Psychol., 24, 417 and 498.

(27) Hotelling, H. (1935), “The most predictable criterion,” ibid., 26, 139.

(28) Hotelling, H. (1936), “Relations between two sets of variates,” Biometrika, 28, 321.

(29) Hsu, P. L. (1939), “On the distribution of roots of certain determinantal equations,” Ann. Eugen.,
9, 250.

(30) Hsu, P. L. (1941), “On the limiting distribution of the canonical correlations,” Biometrika, 32, 38.

(31) Hsu, P. L. (1941), “On the problem of rank and the limiting distribution of Fisher's test function,” Ann.
Eugen., 11, 39.

(32) Hsu, P. L. (1941), “Canonical reduction of the general regression problem,” ibid., 11, 42.

(33) Kendall, M. G. (1946), Advanced Theory of Statistics. London, Vol. II, chapter 28.

(34) Madow, W. G. (1938), “Contributions to the theory of multivariate statistical analysis,” Trans. Amer.
Math. Soc., 44, 454.

(35) Mahalanobis, P. C. (1927), “Analysis of race mixture in Bengal,” J. Asiat. Soc. Beng., 23, 301.

(36) Mahalanobis, P. C. (1930), “On tests and measures of group divergence,” J. Asiat. Soc. Beng., 26, 541.

(37) Mahalanobis, P. C. (1936), “On the generalized distance in statistics,” Proc. Nat. Inst. Sci. Ind., 12, 49.

(38) Martin, E. A. (1936), “A study of an Egyptian series of mandibles, with special reference to
mathematical methods of sexing,” Biometrika, 28, 149.

(39) Pearson, K. (1926), “On the coefficient of racial likeness,” ibid., 18, 105.

(40) Pearson, E. S., and Wilks, S. S. (1933), “Methods of statistical analysis appropriate for k samples
of two variables,” ibid., 25, 353.

(41) Rao, C. R. (1946), “Tests with discriminant functions in multivariate analysis,” Sankhya, 7, 407.

(42) Roy, S. N. (1939), “A note on the distribution of the studentized $D^2$-statistic,” ibid., 4, 373.

(43) Roy, S. N. (1939), “p-statistics or some generalizations in analysis of variance appropriate to multivariate
problems,” ibid., 4, 381.

(44) Roy, S. N. (1942), “The sampling distribution of p-statistics and certain allied statistics on the non-null
hypothesis,” ibid., 6, 15.

(45) Roy, S. N. (1942), “Analysis of variance for multivariate normal populations. The sampling distribution
of the requisite p-statistics on the null and non-null hypothesis,” ibid., 6, 35.

(46) Roy, S. N. (1945), “The individual sampling distribution of the maximum, minimum and any intermediate
of the p-statistics on the null hypotheses,” ibid., 7, 133.

(47) Roy, S. N. (1946), “Multivariate analysis of variance: the sampling distribution of the numerically largest
of the p-statistics on the non-null hypotheses,” ibid., 8, 15.

(48) Roy, S. N. (1946), “A note on multivariate analysis of variance when the number of variates is greater
than the number of linear hypotheses per character,” ibid., 8, 53.

(49) Smith, H. Fairfield (1936), “A discriminant function for plant selection,” Ann. Eugen., 7, 240.

(50) Tintner, G. (1946), “Multivariate analysis of economic data,” J. Amer. Stat. Ass., 41, 472.

(51) Tukey, J. W. (1946), “Vector methods in analysis of variance.” (Paper presented in the programme
at Princeton on November 1.)

(52) Wallace, N., and Travers, R. M. W. (1938), “A psychometric sociological study of a group of speciality
salesmen,” Ann. Eugen., 8, 266.

(53) Wilks, S. S. (1932), “Certain generalizations in analysis of variance,” Biometrika, 24, 471.

(54) Wilks, S. S., Mathematical Statistics. Princeton.

(55) Wishart, J. (1928), “The generalized product-moment distribution in samples from a normal
multivariate population,” Biometrika, 20A, 32.

(56) Wishart, J. (1939), “Statistical treatment of animal experiments,” J.R.S.S. (Supp.), 6, 1.


Discussion on Dr. Bartlett's Paper 

Dr. Geary, in proposing the vote of thanks: In regard to his second sentence, we may
confidently assure Dr. Bartlett that he need have no doubts about the practical usefulness of his
paper. With the ramifications of multivariate analysis in many directions, theoretical and practical, 
demonstration of the precise relations between different theories is of the first importance. Dr. 
Bartlett is eminently qualified for the role of co-ordinator. I like the lecturer's use of selected 
examples to show how different theories are related, a method which has great attraction for all 
but the purest of mathematicians. The well-chosen example has in it the seed of its theorization. 

As most of us know to our cost, matrix algebra has pitfalls for the unwary, and I suspect
that even the most adroit matricists in secret insert subscripts and superscripts like the rest of us! 
In a previous paper of Dr. Bartlett we were reminded that the use of the matrix convention for 
notational economy does not imply any change in computational method. At the hands of 
Bartlett, Hotelling, Wilks and other researchers, use of the notation has by analogy pointed the 
way infallibly from the single- to the many-variate case. The generalized variance of Wilks in 
the notation SS' is perhaps the best example. 

I would like to make a few observations on the remarkable distribution of the roots of the 
determinantal equation, discovered by Fisher and developed by Hsu, Bartlett and others. In
one form or another the determinantal equation has made its appearance in many theories, latterly 
in Tintner’s theory of linear relationships between economic series. On the latter point it has 
recently been shown that the Tintner theory is intimately related to Hotelling's principal component 
theory (1933) through the intermediary of the latent root equation. 

There is one aspect of the theory which may be found interesting, but for which no novelty
is claimed. Let the determinantal equation be given in the form

$$F(\varphi) = |a_{ij} - \varphi\, b_{ij}| = 0$$

where $a_{ij}$ and $b_{ij}$ are sample covariances from independent samples of $n_1$ and $n_2$ respectively
from the same $p$-variate normal distribution, with all universal means zero and universal variances
and covariances known. Then

$$F(\varphi) = (-)^p\, |b_{ij}|\, (\varphi^p - a_1 \varphi^{p-1} + \ldots \pm a_p)$$

where the $a_i$'s are easily ascertained functions of principal minors. In this case the Fisher-Hsu
distribution is

$$c(p, n_1, n_2)\, P' \Big( \prod_{i=1}^{p} \varphi_i \Big)^{\frac{1}{2}(n_1 - p - 1)} \Big\{ \prod_{i=1}^{p} (1 + \varphi_i) \Big\}^{-\frac{1}{2}(n_1 + n_2)} \prod_{i=1}^{p} d\varphi_i$$

where $P' = \prod_{i > j} |\varphi_i - \varphi_j|$.

Effecting a change of variable $\varphi_i$ to $a_j$, the distribution of the latter is found to be

$$C(p, n_1, n_2)\, a_p^{\frac{1}{2}(n_1 - p - 1)}\, (1 + a_1 + \ldots + a_p)^{-\frac{1}{2}(n_1 + n_2)} \prod_{i=1}^{p} da_i.$$

At first sight it would appear that this form should be more workable than the $\varphi_i$ distribution,
since it is far more simple algebraically and its use would avoid the tiresome computation of the
roots $\varphi_i$ for significance testing. It is a complication, however, that the $a_j$ universe of integration
analogous to $a_1^2 - 4a_2 > 0$ for $p = 2$ is difficult to deal with.
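The change of variable rests on the fact that the $a_j$ are the elementary symmetric functions of the roots, so that the product of the roots is $a_p$ and the product of the factors $1 + \varphi_i$ is $1 + a_1 + \ldots + a_p$. A small numerical sketch of these identities follows; the root values are arbitrary illustrative numbers, not data from the paper.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(phis):
    """Return [a_1, ..., a_p], the elementary symmetric functions of the roots."""
    p = len(phis)
    return [sum(prod(c) for c in combinations(phis, k)) for k in range(1, p + 1)]

# Arbitrary positive roots chosen for illustration.
phis = [0.5, 2.0, 3.0]
a = elementary_symmetric(phis)

# Product of the roots equals a_p.
lhs_product = prod(phis)

# Product of (1 + phi_i) equals 1 + a_1 + ... + a_p.
lhs_one_plus = prod(1.0 + x for x in phis)
rhs_one_plus = 1.0 + sum(a)

# For p = 2, real roots require a_1^2 - 4 a_2 >= 0, since it equals (phi_1 - phi_2)^2.
a2 = elementary_symmetric([0.5, 2.0])
discriminant = a2[0] ** 2 - 4.0 * a2[1]

print(a, lhs_product, lhs_one_plus, rhs_one_plus, discriminant)
```

The condition for $p = 2$ shows in miniature why the region of integration for the $a_j$ is awkward: not every set of coefficients corresponds to real roots.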

It is remarkable that no great use has been made in practical work of the latent root
distribution, despite its theoretical fascination. As far as I know, applications have been confined to
the asymptotic case, to which Bartlett and Hsu have made notable contributions. On this point
I would like to ask the lecturer which precisely is the theoretical relation between his theory and
Hsu's theory on the asymptotic distribution (which in both theories is a $\chi^2$) of the sum of the
insignificant roots.

Dr. Bartlett made passing reference to the problem of multivariate analysis of economic data.
The problem here is to discover and to assess the statistical significance or reality of relationship
between variates observed at many points of time. There are, broadly speaking, two schools of
thought. One school is disposed to regard the observations as subject to random error, but all
the variates entering into the equations as known; the view taken is that it is the economist's
job to write down the equations in algebraic form and the statistician's to determine the values
of the parameters. The other school regards all the observations as free from error but certain
of the variates are unknown and are represented in the equation by an error or residual term,
one for each equation. Both schools agree that a theory encompassing the two approaches would
be desirable, but none has been forthcoming so far. Discriminant analysis might reveal if the
former school consists to a larger extent than the latter of researchers in whom a practical
experience in the compilation of economic statistics has bred scepticism as to their accuracy! It is
no detraction of the valuable and suggestive work of both schools to observe that truth is revealing
itself in tantalizingly fitful gleams and that a “larger” theory is urgently required for each. I am
quite sure that Dr. Bartlett's paper will be a help in this direction. For instance, study of Section 3
suggests that in the theory from the error-in-variate point of view (in which Tintner recently
made ingenious use of the Fisher-Hsu distribution of the sum of the insignificant roots of the
determinantal equation to determine the number of significant linear relationships), the
disadvantage of having to assume the error variances and covariances known in advance might be
avoidable.

Sir Cyril Burt, in seconding the vote of thanks, said:

I should like to voice the appreciation of psychologists and others, besides professional statis- 
ticians, to whom Dr. Bartlett’s methods and illustrations will be extremely valuable. To psycho- 
logists, for whom alone I can claim to speak, two points in his paper will be of special interest. 
The first is his discussion of discriminant functions and their extended use. In applied psychology 
more particularly there are numerous problems which call for this technique. May I take one 
example from the many that presented themselves in the course of the work carried out by psycho- 
logists for the Forces during the war? In the Air Force we may have a large number of men 
classified as bombers and pilots respectively; and we may desire to use one and the same set 
of tests for selecting both. That means we should need to ascertain the best linear function for 
discriminating as clearly as possible between the two groups. The weights, however, would 
seem to be the same, whether we use a point biserial or a normal correlation. Still more 
frequently we might have, not two groups, but several groups to discriminate; in some cases 
these might be capable (as Dr. Bartlett’s groups were) of arrangement in a series ; in others (a 
far more puzzling type of problem) they might simply form co-ordinate classes. Provided our 
test-measurements had first been duly standardized to eliminate any unintended weighting, we 
found (as he has done in comparing his own final figures with Barnard’s) that unweighted 
regressions give almost the same results as weighted. But what I think is most important for 
the practical psychologist to bear in mind is his incidental warning about the cautions to be 
observed in applying all such methods, wherever we are not quite sure whether the underlying 
assumptions are adequately fulfilled. 
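
The two-group case described above can be illustrated with a minimal sketch. It assumes Fisher's linear discriminant, with weights proportional to the inverse pooled within-group covariance matrix applied to the difference of group means; the speaker does not say which computation he used. The scores, group names and function names are hypothetical and serve only as an example.

```python
# Hypothetical scores on two tests for two groups of men (illustration only).
GROUP_A = [(6.0, 4.0), (7.0, 5.0), (8.0, 4.0), (7.0, 6.0)]
GROUP_B = [(4.0, 5.0), (5.0, 6.0), (4.0, 7.0), (5.0, 5.0)]


def mean_vector(rows):
    """Mean of each test score over the rows of one group."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix of the two groups."""
    dims = len(group_a[0])
    sums = [[0.0] * dims for _ in range(dims)]
    for group in (group_a, group_b):
        m = mean_vector(group)
        for row in group:
            d = [row[j] - m[j] for j in range(dims)]
            for i in range(dims):
                for j in range(dims):
                    sums[i][j] += d[i] * d[j]
    df = len(group_a) + len(group_b) - 2
    return [[s / df for s in row] for row in sums]


def discriminant_weights(group_a, group_b):
    """Weights w = S^{-1}(mean_a - mean_b) for exactly two test scores."""
    s = pooled_covariance(group_a, group_b)
    ma, mb = mean_vector(group_a), mean_vector(group_b)
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    # Explicit inverse of the 2 x 2 matrix S applied to the mean difference.
    return [
        (s[1][1] * d[0] - s[0][1] * d[1]) / det,
        (-s[1][0] * d[0] + s[0][0] * d[1]) / det,
    ]


weights = discriminant_weights(GROUP_A, GROUP_B)
print("discriminant weights:", [round(w, 3) for w in weights])
```

Only the ratio of the two weights matters for discrimination, so any rescaling of the weights gives the same separation of the groups.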

The second topic he has dealt with more briefly, namely, the internal canonical analysis of a 
group of variates, which psychologists commonly term, perhaps a little inappropriately, factor- 
analysis. Since this type of analysis is capable of fruitful applications both inside and outside 
the field of psychology, many of us will hope that Dr. Bartlett will find it possible to develop this 
topic a little more fully on another occasion. 

I myself believe that there are close and unrecognized analogies between the aims and the 
methods of factor analysis on the one hand, and those of the analysis of variance on the other ; 
and this appears to be confirmed by the approach adopted by Dr. Bartlett. Thus, it would seem 
that the method he has used (in Section 111 of his paper) to demonstrate the canonical reduction 
of the general regression problem could be readily modified to meet the conditions of the internal 
type of analysis. Here and elsewhere (may I say in passing?) I much admire the succinctness and 
lucidity of the treatment which Dr. Bartlett has been able to achieve by the use of vector and 
matrix notation, which I think he was among the first to advocate many years ago. 

There is one minor point which will be welcome to the amateur who wishes to make 
practical use of statistics, and that is the encouraging stress that Dr. Bartlett has laid upon the 
use of approximate tests of significance, particularly of the chi-squared type. In factor-analysis, 
for example, the ideal method is undoubtedly that to which Dr. Bartlett has referred— namely, 
the method of principal components. This method, which is now usually associated with the 
name of Hotelling, was first suggested by Karl Pearson as early as 1901, and was in fact the earliest 
form of factor-analysis ever proposed. It is, however, far too laborious to employ in actual 
practice ; and factor-analysis would probably have remained unknown, had it not been for the 
simpler approximate methods later put forward as convenient substitutes by Spearman and other 
psychological workers in this country. But with these approximate methods any rigorous test 
for the significance of factors seems out of the question. Approximate tests have been devised ; 
but they are hardly ever used, because the psychologist is too afraid that he may be criticized by the 
professional statistician for adopting inaccurate procedures. 

I venture, therefore, to think that Dr. Bartlett’s paper is not only a valuable contribution to 
theory, but of special interest to the working practitioner; and on these grounds I should like 
heartily to second the vote of thanks. 

The vote of thanks was put to the meeting and carried unanimously. 

The Chairman said he had listened to Dr. Bartlett's paper with great interest because it was a 
general survey of a number of problems involving multiple variates in which the solution was 
materially assisted by the use of the approximate $\chi^2$ test first put forward by Dr. Bartlett in 1938 
as a method of dealing generally with the Wilks criterion $\Lambda$. Possibly it was one of the minor 
tragedies of war that this test, which can be applied to values of $p$ and $q$ greater than 2, had not 
yet gained the currency that it deserved. The present paper should serve to give the test, and the 
methods to which it applied, greater prominence, and he looked forward to the time when this 
test, which he believed to be a remarkably good approximation to the exact test, would become 
as well known as Dr. Bartlett’s earlier $\chi^2$ test for the homogeneity of a set of variances. 

This was a case where the problems confronting the experimenter had been leading before the 
war to almost simultaneous treatment by the theoretical statistician. Dr. Bartlett had referred 
to some calculations he had made in 1938 when he (the Chairman) opened a discussion on the 
statistical treatment of animal experiments. Now in an earlier paper (published in Biometrika) 
he had said that “were the decisions reached by separate examination of the growth rate and 
change of growth rate figures not so clear-cut, it might be necessary to take these figures (g and 
h; i was added later) together in a simultaneous analysis of variance and covariance, and reach 
a single test of significance of the effect of food (or of sex) on both simultaneously, after the manner 
suggested by Bartlett.” The reference here was to Dr. Bartlett’s 1934 paper (reference 3). But 
at that time the test which he had drawn attention to now had already been worked out, although 
the paper (reference 4) was not published until after the paper containing the above quotation 
was written. 

This paper continued, “Not only so, but the fact that it is desirable to take initial weights into 
account suggests that Bartlett's method should be applied to the variables derived from g and 
h (and i) when w is held constant, and a test of significance derived in the same sort of way as 
in the usual covariance analysis.” Dr. Bartlett’s calculations, made at the time of the 1938 
discussion and reproduced in the present paper, did in fact employ the covariance method to 
eliminate the initial weight of the animals, but he fancied that Dr. Bartlett had forgotten that 
fact, because not only was it not referred to in Section 3 of the present paper (where n should be 
21, not 22), but he also referred in Section 7 (iv) to some work of Cochran as if it were a new development, 
whereas the only thing new about it was its application to discriminant functions. This 
did not, of course, detract from the value of Dr. Bartlett’s paper, and the point he would make 
in conclusion was that a great stimulus was often to be had in the development of new theoretical 
results by studying the experimental work that was going on all the time. 

Mr. C. Radhakrishna Rao said attempts had been made in recent years to generalize the 
technique of analysis of variance of a single variate to the case of several mutually correlated 
variables. Except in the case of comparison between two groups for which Hotelling’s distribu- 
tion supplied the complete solution no exact tests had been put forward for practical use. When 
one went through the complicated mathematical symbols and formulae (perhaps for no fault of 
the authors) appearing in the journals on this topic, there came a feeling that there was no immediate 
prospect of these being made available for ready use. Dr. Bartlett's illustrations of tests 
in his paper were particularly important in that they satisfied, to a large extent, the needs of an 
applied worker in multivariate problems. But there were a few points on which the speaker 
sought clarification. 

The variables $x_1, \dots, x_p$, $y_1, \dots, y_q$ might stand for $p + q$ mutually correlated characters 
for which samples of sizes $n_1, n_2, \dots, n_k$ were available from $k$ populations. An important 
problem which often arose in biometric studies was to test whether all the differences among the 
$k$ populations could be explained by the differences in the x's only, or, in other words, whether 
any additional information was gained by using the y's in association with the x’s. 

Dr. Bartlett called this an internal analysis of and proceeded to say that an alternative 
factorization of $\Lambda$ based on $p + q$ variates corresponded to the separation of .S'., into two groups 
So and S 2.0 so that $\Lambda = \Lambda'\Lambda''$, where $\Lambda'$ corresponded to the x’s. The statistic $\Lambda''$ was used to 
test for the additional information supplied by the y’s. 

As an alternative to this one might set up a multiple analysis of variance and covariance 
table involving the $q$ y's treated as dependent variates and the $p$ x's as independent variates. 
This supplied, on eliminating the variations due to the x's, the dispersion matrices: 

(i) $W(y \mid x)$, based on the “within” analysis of populations, having $n_1 + \dots + n_k - k - p$ 
degrees of freedom, and 

(ii) $M(y \mid x)$, based on the “between” analysis of populations, having $(k - 1)$ degrees of 
freedom. 

The problem was one of comparison of $W$ and $M$. The theory of canonical roots could be 
applied so that one could compute the maximum, minimum or any intermediate root $\lambda$ of the 
determinantal equation: 

$$|W + M - \lambda W| = 0.$$

The significance of the individual roots could then be tested for. An overall test was provided 
by the ratio $|W| / |W + M|$. Using Bartlett's approximation the statistic 

$$-\left[n - k - p - \tfrac{1}{2}(k + q)\right] \log_e \frac{|W|}{|W + M|}$$

could be used as $\chi^2$ with $q(k - 1)$ degrees of freedom. 

This test, perhaps, differed from Dr. Bartlett’s $\Lambda''$ criterion and might lead to a different 
interpretation. It was easy to recognize that in the above problem some of the x’s might be in 
the nature of concomitant variates. Following this method the problem of Table III was identical 
in comparing the matrix $M(i, h \mid g, w)$ representing the variances and covariances of $i$ and $h$ 
due to treatments when $g$ and $w$, the initial weight of pigs, were eliminated, with the matrix 
$W(i, h \mid g, w)$ representing the variances and covariances due to error when $g$ and $w$ were eliminated. 
Again, the theory of canonical roots could be employed to test whether $g$ was sufficient 
to explain the differences in the growth curves. An unsymmetrical test using h.g and i.hg was not 
convincing since an alternative resolution i.g and h.ig might reveal a different story. On the 
other hand, a complete test for $i$, $g$, $h$ after eliminating $w$ was supplied by the comparison of 
matrices $M(g, h, i \mid w)$ and $W(g, h, i \mid w)$. 

The second problem to which he drew Dr. Bartlett's attention was associated with his treatment 
of Barnard's data. There were four variables, $x_1$, $x_2$, $x_3$ and $x_4$, and four groups. The 
object was to test whether the differences in the groups could be explained by linear regression 
of individual variates on time separating the groups. Dr. Bartlett used an alternative factorization 
of $\Lambda$. 

If they denoted the internal dispersion matrix of Table V by $W$, that due to within plus regression 
of Table VI by $W + R$ and the total of Table X by $T$, they found the dispersion matrix 
of deviation from regression as $T - W - R$ with 2 d.f. The matrix $W$ had 394 d.f. and the 
problem was again one of comparing these two matrices. An overall test was supplied by the 
ratio: 

$$\frac{|W|}{|T - W - R + W|} = \frac{|W|}{|T - R|}.$$

In the above example he found: 

$$|T - R| = \begin{vmatrix}
9665.247740 & 449.008478 & 1149.501013 & 2142.197474 \\
 & 9099.726441 & 1265.691535 & 2231.524444 \\
 & & 4049.689004 & 1203.298256 \\
 & & & 9257.368621
\end{vmatrix} = 10^{16}\,(0.26873816)$$

and $|W| = 10^{16}\,(0.24269054)$, so that the ratio was 0.90307456. 

SUPP. VOL. IX. NO. 2. 




The value of $\chi^2$ in this case was 

$$-\left[396 - \tfrac{1}{2}(4 + 2 + 1)\right] \log_e 0.90307436 = 40.02$$

with 8 d.f., whereas $\chi^2 = 41.50$ according to Dr. Bartlett’s criterion. These two tests appeared 
to be giving nearly equal results. He sought Dr. Bartlett’s views on the comparative advantages 
of the test he was inclined to put forth and Dr. Bartlett’s own criterion. 
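
The arithmetic of this example can be checked with a short sketch. It assumes the decimal points of the printed matrix fall as shown in the code, and it takes the ratio 0.90307436 as quoted rather than recomputing it, since the matrix $W$ itself is not reproduced in the discussion. The function names are illustrative only.

```python
import math

# Symmetric matrix T - R as quoted by Mr. Rao; the printed upper triangle
# is mirrored, and the decimal points are an assumption of this sketch.
T_MINUS_R = [
    [9665.247740, 449.008478, 1149.501013, 2142.197474],
    [449.008478, 9099.726441, 1265.691535, 2231.524444],
    [1149.501013, 1265.691535, 4049.689004, 1203.298256],
    [2142.197474, 2231.524444, 1203.298256, 9257.368621],
]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_chi_square(total_df, n_variates, hypothesis_df, ratio):
    """-[total_df - (n_variates + hypothesis_df + 1) / 2] * log_e(ratio)."""
    multiplier = total_df - 0.5 * (n_variates + hypothesis_df + 1)
    return -multiplier * math.log(ratio)


det_t_minus_r = determinant(T_MINUS_R)
chi_square = bartlett_chi_square(396, 4, 2, 0.90307436)
print(f"|T - R| = {det_t_minus_r:.6e}")
print(f"chi-square = {chi_square:.2f} on {4 * 2} d.f.")
```

The 396 degrees of freedom are the 394 of $W$ plus the 2 of the deviations from regression, and the 8 degrees of freedom for $\chi^2$ are the product of the 4 variates and those 2 degrees of freedom.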

It might be noted that the above test could be applied even if the problems were to test (i) 
whether a parabolic regression with time was adequate to explain the differences, and (ii) whether a 
parabolic fit was better than a linear fit. One had to get the appropriate matrix due to deviations 
from the hypothesis and compare it with the error matrix. The comparison might be done 
by testing the significance of the canonical roots or their symmetric functions. They thus had a 
generalization of analysis of variance to multivariate problems which might be termed Analysis 
of Dispersion. 

For further details relating to tests by analysis of dispersion and the exact distribution of the 
A criterion he referred the interested reader to a paper by him due to appear in a forthcoming 
issue of Biometrika. 

Dr. Herdan said that the underlying unity of structure between univariate and multivariate 
analysis of variance had been excellently exhibited in Dr. Bartlett's paper. It might perhaps not 
be amiss to say a few words about the difference between these methods and the different purposes 
they served in practical problems. This would also afford an opportunity of stressing the 
differences in structure, in spite of the fundamental unity. 

Such differences were brought out clearly in certain typical cases of industrial research where 
the various methods of analysis were used to supplement one another. 

To fix our ideas we could think of an investigation into the protective properties of anti-corrosive 
paints. The various compositions of such paints differ in their chemical and physical 
properties, and it is of interest to know which composition affords better protection. Each 
composition is put on steel panels, which are then exposed to atmospheric influences and are 
inspected at regular intervals for corrosion. Corrosion, however, is a complex concept and it is 
described by various characteristics such as: the complete paint film intact (per cent, by area), 
anti-corrosive paint only intact (per cent, by area), bare but unrusted steel (per cent, by area), 
rusted metal (per cent, by area), depth of pits, loss in weight, blistering, flaking, cracking, etc. 

It was conceivable to apply to a problem of this kind analysis of variance and covariance 
which would give the regression of the change in composition on the various properties descriptive 
of corrosion. But although this was formally possible, it was doubtful whether it would be of 
any practical value, since what one was interested in in such experiments was the overall question 
which composition afforded a better protection against corrosion as described by all the examined 
characteristics together. It was usual, therefore, to sum the marks given for each characteristic, 
either unweighted or arbitrarily weighted, and thus to arrive at a figure of merit for each 
panel. 

There were two ways of improving this by statistical methods : as an objective way for arriving 
at the figure of merit, the theory of discriminant functions, or of canonical correlations, suggested 
itself. The various examination results were recorded as so many variables which, on the basis 
of the intercorrelations, could be compounded into one overall variable — the discriminant 
function. Every panel was then characterized by one number only and the total could be subjected 
to univariate analysis of variance. 

The same was achieved by canonical correlations or by factor analysis where the factors took 
the place of the discriminant function. Although, primarily, in this case, we aimed at and obtained 
a description of the variables in terms of the factors, we then arrived by regression or other methods 
at a linear combination describing the factors in terms of the variable. Having done this, we 
could then apply univariate analysis to find out which chemical ingredient or physical properties 
in the various compositions made for better protection. Thus, multivariate analysis was here 
used as a preliminary to univariate analysis of variance. 

Wherein now lay the specific function of factor analysis as another form of multivariate analysis? 
We should realize that there was a certain arbitrariness in compounding all variables into one 
linear relation as was done in arriving at the discriminant function. This could be partly 
cured by constructing two or more discriminant functions, but even so we were arbitrary 
in grouping the variables. In certain cases the linear combination of variables was desirable 
but arbitrariness in grouping must be avoided. The testing of rubber might serve as an 
illustration. Rubber was, at various stages of manufacture, subjected to various tests like 
abrasion, elongation, hardness, resilience, indentation, stress, benzene swelling, tensile 
strength. All these tests were highly correlated with one another, some positively, some negatively, 
and it seemed desirable to arrive at what might be called the true dimensionality of the problem 
by ascertaining whether the number of tests could not be reduced to a comparatively small number 
of factors. It would, however, be risky in cases like this to combine all the tests into one linear 
combination or into arbitrary groups. What we needed, therefore, was a method for arriving at 
the true dimensionality of the tests. This was precisely what could be done by means of factor 
analysis, since factor analysis was essentially a method by which to ascertain whether an assumed 
pattern of grouping of variables was justified. It was the essence of factor analysis to confirm 
or invalidate the provisional allocation of variables to groups of linear combinations. It yielded 
a structure and a pattern, whereas in discriminant functions, although we had a structure, the 
pattern was arbitrary. For testing rubber, therefore, it would seem desirable to reduce the 
number of tests by legitimate methods, like factor analysis, to the smallest number of factors 
compatible with the complex of correlations, and then to subject each of these factors to analysis 
of variance with the object of finding out which ingredients or properties made for significant 
test results. 

Regarding factor analysis as one case of multivariate analysis threw an interesting light on 
the objection frequently raised against factor analysis that factors are only mathematical con- 
structs, and thus have no reality. It was readily conceded that the identification of factors was 
no longer part of factor analysis, and could only be achieved without factor analytical methods 
with some degree of uncertainty. But, according to the interpretation just given, that objection 
seemed beside the point. The compounding of a number of observable variables into one or 
a few factors was not meant to yield again an observable variable, and could not be expected to 
do so. All we could expect to get in this way was a hypothetical variable which might or might 
not correlate with something in nature. The justification of factor analysis was that it facilitated 
description, to which Dr. Herdan believed attention was first drawn by Professor Cyril Burt in 
his Factors of the Mind. It had been said by H. Poincaré that “geometry is not true but convenient.” 
Why should we expect more from factor analysis? 

It appeared that in certain respects multivariate analysis was the reverse of univariate analysis. 
Whereas in univariate analysis we aimed at splitting up the total variance into partial variances 
corresponding to the various causes which produced the total effect, in multivariate analysis we 
combined a number of variables linearly into one new variable. Thus, in univariate analysis 
of variance we had one effect which we explained as depending on a number of causes, whereas in 
multivariate analysis of variance we had one cause whose regression upon a number of effects 
we wanted to establish. This could be clearly seen from the illustrations in Dr. Bartlett's paper : 
the variables “treatments,” “food,” and “time” occupied purely formally the position of the 
variables whose total variance was analysed into two parts, one being the chance residual and the 
other representing the amount of the total variance accounted for by the regression of “treatments” 
upon “grain” and “straw,” of “food” upon the constants $h$, $i$, and of “time” upon the 
changes in the skull dimensions $x_1$, $x_2$, $x_3$, $x_4$, and thus appeared as the “dependent” variables. 
Yet there could be no doubt that they were, in fact, the physical causes of the changes in the 
“independent" variables which represented their physical effects. All this suggested that a suitable 
term comprising the various forms of multivariate analysis of variance would be “Synthesis of 
Variance.” 

There was an error in Dr. Bartlett's paper which might be confusing to those interested in 
the changes of skull measurement with time. Incidentally the same error appeared in Mr. 
Kendall's Advanced Theory of Statistics, Vol. II. Although the order of the quantities in Table IV 
and in the final results was the same as in the original paper by Miss Barnard, the order in the 
description of the tests (p. 182 of Dr. Bartlett's paper) was not. The characteristics in question 
corresponded to those denoted by $x_1$, $x_2$, $x_3$ and $x_4$ in the original paper, and in this order: 
maximum breadth, face length, nasal height, basibregmatic height. 

According to the original paper, there was an increase with time in maximum breadth and 
nasal height and a decrease in the other two characteristics. On the basis, however, of the list 
given in p. 182 of the present paper there would be an increase of face length and maximum 
breadth and a decrease of the other two. 

Dr. C. A. B. Smith, drawing attention to discriminant functions and the discrimination between 
two populations, said he would like to ask Dr. Bartlett what assumptions were being made in the 
application of multivariate analysis to the theory of discriminant functions. It was well known 
that in ordinary significance tests the departure of the underlying distribution from normality 
was not important, because there would be a tendency to normality in the distribution of, e.g., 
the mean of a sample, but in discriminating between two different groups they were discriminating 
the single individual each time, and the departure from normality would, as a result, make the 
discriminant function less powerful. In fact, if the variance and covariance were not the same 
in the two groups, the usual linear function would not be the best function or even the best linear 
function. It was even found in the data with which they had to deal at the Galton Laboratory 
(psychological and biometrical data) that the distributions were seriously non-normal, and in 
dealing with normal and abnormal subjects, the abnormals frequently had greater variance than 
the normals. He asked, therefore, if anything was known as to how much the tests would be 
affected by non-normality. 


Mr. Moroney had met with an industrial problem which could be described in terms of 
discriminatory analysis. It related to the appraisal of cements used in metal to rubber bonding. 
The various measurements which could be taken during manufacture gave a guide to subsequent 
performance, but certainly no guarantee that the cement would stick well. In facing this type 
of problem he had been led to the idea of predicting performance from the complex of measure- 
ments by the use of some discriminatory function. He wondered whether Dr. Bartlett knew of 
any similar application in the industrial field. He was sure industry offered great scope for this 
branch of theory. 

Dr. Bartlett, replying to the discussion, said that in regard to Dr. Geary's query about the 
relation of his test and the approximation obtained by Hsu for the sum of the squares of the 
canonical roots, the fact was that the two tests were asymptotically equivalent. They depended 
on the same $\chi^2$ distribution if one made the sample large enough, but the further point he was 
concerned with in the test he had used in the paper was to have a test which was, as far as 
possible, of known accuracy in finite samples. He had pointed out where it had sufficient accuracy 
and where it needed further investigation. 

In regard to Dr. Rao’s remarks on the possible alternative methods of testing, he did not, he 
was afraid, quite follow the exact basis of the alternatives he proposed. He (Dr. Bartlett) would, 
therefore, like to study his remarks in writing before attempting to comment. He was interested 
in Sir Cyril Burt's reference to Karl Pearson, with which he, personally, was not familiar, in 
suggesting that the reduction to principal components went back as far as 1900. 

With reference to Dr. Smith's query about the effect of non-normality and unequal variance, 
that was a question about which he reserved the right to reply. He did not believe it had been 
given much thought, and he fancied the result would depend on the purpose of the canonical 
analysis. In the case of the grain or straw yield or the pig analysis he rather doubted whether 
the application of the discriminant function in the sense of which Dr. Smith was thinking was 
necessarily required and, therefore, the known result that analysis of variance which referred to 
the sample as a whole was fairly insensitive to such departures would apply also to the canonical 
analysis, but the case where one merely used the discriminant functions on a further single observa- 
tion was a problem which, he agreed, wanted further consideration. 

He was afraid he could not help Mr. Moroney; he could not place any industrial application 
of the type that Mr. Moroney was asking for. He did not know whether anyone else could help, 
if so, perhaps he would be good enough to write to the Society or to Mr. Moroney giving the 
reference. 

Dr. Bartlett replied further in writing as follows: 

The chief point outstanding seemed to be to clear up any possible differences between Dr. Rao 
and himself. Dr. Rao had not made the point at issue very clear, as his method for his first 
example seemed identical with the method proposed in the paper. This was assuming that Dr. 
Rao’s function $M$ was obtained as usual by subtracting the “within” term $W$ from the “total” 
term $M + W$, as was necessary from the $k - 1$ degrees of freedom assigned to it; in this case 
the total degrees of freedom would be $n - p - 1$ and not $n - k - p$, and the best numerical 
coefficient $n - p - 1 - \tfrac{1}{2}(k + q)$. Again, in the pig weight analysis, the unsymmetrical 
test of g, h.g and i.gh was deliberate, as the quantities g, h, i were naturally considered in that 
order. 

But Dr. Rao had put his finger on a real point that wanted clarifying. It arose in the g, h, i 
analysis and also with the skull data, but in principle it had nothing to do with multivariate analysis, 
arising just as well in an ordinary analysis of variance. The point was simply this. Suppose 
$A$, $B$ and $C$ were three independent sums of squares, of which the first two were to be tested against 
the residual term $C$. The complete null hypothesis was tested by the ratio $C/(A + B + C)$, which 
had two alternative factorizations into independent quantities, 

$$\frac{C}{A + B + C} = \frac{C}{B + C} \cdot \frac{B + C}{A + B + C} = \frac{C}{A + C} \cdot \frac{A + C}{A + B + C}.$$

If logarithms were taken, either factorization gave an additive analysis of the left-hand side. If 
B were not inflated by any real effect, the second factor of the first factorization gave the most 
efficient test of A. But if B did contain a real effect, the first factor of the second factorization, 
being independent of B, would be, in general, a more sensitive test of A, and would be used 
although no longer additive with the test of B. 
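
A small numerical sketch may make the two factorizations concrete. The sums of squares below are hypothetical values chosen only for illustration, and the function name is not from the paper; the code merely confirms that each pair of factors multiplies back to $C/(A + B + C)$, so that the logarithms add.

```python
import math


def ratio_factorizations(a, b, c):
    """Return C/(A+B+C) and its two factorizations as pairs of factors."""
    overall = c / (a + b + c)
    # First factorization: C/(B+C) tests B against C; (B+C)/(A+B+C) tests A
    # against the pooled term B + C.
    first = (c / (b + c), (b + c) / (a + b + c))
    # Second factorization: C/(A+C) tests A against C alone; (A+C)/(A+B+C)
    # tests B against the pooled term A + C.
    second = (c / (a + c), (a + c) / (a + b + c))
    return overall, first, second


# Hypothetical sums of squares, not taken from the paper.
overall, first, second = ratio_factorizations(a=12.0, b=3.0, c=60.0)

for name, (f1, f2) in (("first", first), ("second", second)):
    log_sum = math.log(f1) + math.log(f2)
    print(name, round(f1, 4), round(f2, 4), round(log_sum, 6))
print("overall", round(overall, 4), round(math.log(overall), 6))
```

The factor $C/(A + C)$ in the second pair does not involve $B$ at all, which is the property Dr. Bartlett relies on when $B$ contains a real effect.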

He could not say he liked Dr. Rao’s phrase “analysis of dispersion,” preferring the more 
self-explanatory “multivariate analysis of variance.” 

With regard to Dr. Herdan’s remarks, he was not sure that these clarified the relation of an 
internal factor or canonical analysis with the external canonical analysis he had been mostly 
concerned with in his paper. Dr. Bartlett agreed, of course, that both were cases of multivariate 
analysis, but felt it most important not to confuse people into thinking that an internal factor 
analysis of a set of variates could give any information whatsoever on the external relation of 
this set with a second set. It was still a common fallacy, for example, to assume that the major 
factor emerging from an analysis of a set of marks in different tests was necessarily the best guide 
to a student’s ultimate performance, whereas of course the question is unanswerable unless 
“ultimate performance” is defined, measured and correlated with the different test scores. 

He was indebted to Dr. Herdan and to Dr. Wishart for drawing attention to slips in the paper. 
These he had now corrected in the text. While it was true that he had, as Dr. Wishart suggested, 
forgotten the use of covariance to eliminate initial pig weights, there was a new application 
involved in the application by Cochran and Bliss to discriminant functions, although he agreed 
and had in fact mentioned in the paper that no new principle arose. 

He did not think on consideration that he had anything useful he could add to his spoken 
comments on the question of non-normality raised by Dr. Smith, except to draw attention to 
two recent papers in the Annals of Eugenics, 13 (1947), from the Galton Laboratory (“Some 
notes on discrimination,” by L. S. Penrose, p. 228; “Some examples of discrimination,” by 
C. A. B. Smith, p. 272). These referred to the effect of unequal variances; the general problem 
of the effect of departures from the standard assumptions would, he hoped, receive further 
investigation. 

Dr. Bartlett concluded with thanks to all the speakers for their contributions to the discussion. 



198 Anscombe, Godwin and Plackett — Methods of Deferred Sentencing in Testing [No. 2, 


Methods of Deferred Sentencing in Testing the Fraction Defective of a Continuous Output 

By F. J. Anscombe (Rothamsted Experimental Station), H. J. Godwin (University College of 
Swansea), and R. L. Plackett (National Physical Laboratory) 

Summary 

When the output of a continuous process is subjected to a routine test, in which it is observed 
whether the articles sampled are satisfactory or defective, a clustering of defectives (i.e. the occurrence 
of several defectives in rapid succession) may be taken as an indication that quality has 
deteriorated. In Part I inspection schemes of this kind are considered and compared with sequential 
sampling methods applied to the product after it has been divided into bulks of a suitable 
size. In Part II it is shown how the operating characteristic of a deferred sentencing scheme can 
be calculated. 

These schemes are, we believe, the only inspection procedures available at present that relate 
specifically to a continuous output and apply when the inspection test is destructive. They are 
non-rectifying. 

Mathematically, we are concerned with, among other problems, the frequency of clusters of 
random points on a line of infinite length. 

Most of the work on which this paper is based was carried out by the authors in the Ministry 
of Supply Advisory Service on Statistical Method (S.R. 17), and their thanks are due to the 
Director-General of Scientific Research (Defence) for permission to publish it. Part I and the 
first section of Part II are by F. J. Anscombe, and the remainder of Part II by H. J. Godwin and 
R. L. Plackett, as indicated by authors’ initials. 


Part I. — Description of Methods 
Testing Continuous Production 

1. We consider routine tests on a mass-produced article, under the following conditions. 

(i) We are concerned with testing the output of a single production line. The quality 
of the output is continuous, in the sense that production is not in recognizable batches of 
uniform quality. If a change in quality, temporary or permanent, occurs, it may occur, 
so far as we know, at any time. If it is the producer's practice to despatch his product 
in “lots" bearing a lot-number, the size of the lot is determined solely by considerations 
of convenience in despatching or book-keeping or whatever, and is not determined by 
anything known to have an important influence on quality of output. For example, each 
1000 articles may be boxed up to form a “lot" as they come off the line, simply because 
1000 is a convenient number for book-keeping. (Alternatively, the product may be known 
to be considerably influenced by some factor, such as the shift or the batch of raw material, 
and so naturally falls into batches of relatively uniform quality, but these batches are too 
small in size for adequate individual testing. The uniformity of the batches is then of 
no help to the inspector.) 

(ii) We are concerned with the final test of the articles produced, and the test is
destructive and costly, and therefore carried out on a small scale. It is a functioning test, i.e.
an article tested is either satisfactory or defective; and it is applied to samples from the
whole of the output. Its purpose is to check that quality is satisfactory rather than to
weed out bad from good lots. As a result of the test lots are either accepted or rejected,
where by rejection may be meant outright scrapping or investigation followed perhaps by
an attempt to remedy the fault. The test may be carried out by the producer or the
consumer, for different reasons (we shall actually consider it from the consumer’s point of
view).

2. The method by which this inspection problem has often been met is to select a small number 
of articles, perhaps three or five, from each lot, test them, and if no defectives are encountered 
accept the lot, and if one or more defectives are encountered reject the lot, or select and test a 
few more articles, accepting the lot if no further defects are observed, or some such rule. Thus 



1947] 


the Fraction Defective of a Continuous Output 


199 


each lot is sentenced separately, by a single-sampling or double-sampling method. But the 
number of articles tested is so small that the test is very insensitive. A 10 per cent. defective lot
would clearly stand a high chance of being accepted under such a scheme, and in some instances
that have been observed a considerably lower fraction defective than 10 per cent. was considered
unacceptable by the user. Then such an inspection scheme exercises only a moral control of the 
quality. And from the producer’s point of view such a scheme is discouraging, since usually the 
lots rejected will be no different in quality from the neighbouring lots accepted. 

3. One possible method of achieving an effective direct control of the quality, without increas- 
ing the rate of test (i.e. the number of items tested per 1000 made), is to group the product into 
larger bulks than the manufacturer’s “lot”, and apply an efficient sampling method, which might 
be, for example, a Wald sequential scheme (Refs. 1, 2 and 7), to each bulk. Each bulk would be 
accepted or rejected as a unit, and we should have a guarantee of the average quality of the bulk. 

4. An alternative way of tackling the problem is by the method now to be described. It was 
conceived and first applied by H. B. Spalding, P. Halliday and E. H. Sealy, in 1942, in connexion 
with some military stores (Refs. 5 and 6). Since then it has been used fairly widely, and subjected 
to a good many variations. The underlying idea is as follows. 

It is supposed that the quality of the output remains constant in stretches, and that when there 
is a change in quality the change is sudden, or a gradual trend in one direction, but not a rapid 
oscillation. (Rapid oscillations will not, in fact, be observable with a low rate of testing, but 
only the general trend.) Any possible precautions will be taken by the manufacturer to ensure 
that this state of affairs holds, by, for example, using up raw materials in the order of their supply, 
and avoiding a continuous belt system or anything confusing a direct orderly flow along the
production line. (Precautions of this kind are often not possible, however.) If now items are selected
for test at a regular rate, and if the quality of the product remains constant, defectives will be found 
with a constant probability in a random distribution. This will involve occasional accidental 
clusterings of the defectives, when they will occur in rapid succession. If the quality deteriorates 
the clusterings will become more frequent and pronounced. The method of sentencing consists 
of deciding on a criterion for a significant cluster and rejecting the product represented by such a 
cluster. In general, sentence on a lot or batch of the product is not passed as soon as the test
items from the batch have been tested, but is delayed until the immediately following batches
have been tested, in case a significant cluster should develop. Such schemes have therefore been
given the name of deferred sentencing.*


The Simplest Deferred Sentencing Scheme 

5. The simplest type of deferred sentencing scheme is the following, and this has been the most 
fully investigated. 

Sentencing rule. — The product, as it leaves the line, is divided into small lots, and one
item is selected from each for test. D and n being given integers, whenever n defective
items are encountered out of D or fewer consecutive lots tested, all the lots consecutively
from that giving the first to that giving the nth defective in the cluster are rejected. Lots
not rejected by this rule are accepted.

A run of clear lots, i.e. lots giving satisfactory test results, of length D will be accepted at once,
and so will following clear lots. If a defective occurs, sentence will be suspended until either a
further D − 1 lots have been tested or n − 1 further defectives have been encountered, whichever
occurs the sooner. In the first case, if the D − 1 following lots give fewer than n − 1 defectives,
the first defective and any succeeding clear lots will be accepted. As soon as n defectives occur
in not more than D lots, all lots not so far sentenced will be rejected. Thus sentence will
sometimes be passed at once and sometimes with a delay not exceeding D − 1 lots. Some of the lots
to be rejected according to the sentencing rule may already have been rejected through the
operation of the rule on a previous cluster of n defectives partly overlapping the one being considered.
The actual number of new lots rejected whenever the rule operates may therefore be any number
from 1 to D.
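
The sentencing rule of §5 can be sketched in code. What follows is a minimal illustration and not part of the original paper: the function names `sentence` and `simulate_acceptance`, the representation of test results as a list of booleans, and the pseudo-random simulation are all assumptions introduced here. The first function marks for rejection every lot lying between the first and the nth defective of any cluster of n defectives that spans D or fewer consecutive lots.

```python
import random


def sentence(defective, D, n):
    """Apply the simple deferred sentencing rule of §5 (illustrative sketch).

    defective: sequence of booleans, one test result per lot (True = defective).
    Returns a list of booleans, True where the lot is rejected.
    """
    positions = [i for i, bad in enumerate(defective) if bad]
    rejected = [False] * len(defective)
    # Examine every run of n successive defectives in the record.
    for j in range(len(positions) - n + 1):
        first, last = positions[j], positions[j + n - 1]
        if last - first + 1 <= D:  # the n defectives span D or fewer lots
            for i in range(first, last + 1):
                rejected[i] = True
    return rejected


def simulate_acceptance(p, D, n, lots=200_000, seed=0):
    """Monte Carlo estimate of the proportion accepted at constant quality p.

    The simulation itself is an assumption of this sketch, not a method
    used in the paper.
    """
    rng = random.Random(seed)
    results = [rng.random() < p for _ in range(lots)]
    rejected = sentence(results, D, n)
    return 1 - sum(rejected) / lots
```

A call such as `simulate_acceptance(0.005, 194, 5)` gives an estimate that may be set against the 99 per cent. figure of the example in §7; the estimate is subject to sampling error, and the tabulated values are limits for small p.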

* They were first known by the not wholly fortunate name of “rational sentencing”. The word
“sentencing” itself is that regularly used by Government Inspectorates to denote their decision to accept or
reject lots of ammunition submitted to them.




6. We shall choose D and n so that if the quality of the product is satisfactory, there is only a 
very small risk of any product being rejected owing to an accidental clustering of defectives in the 
inspection. We shall also require that if the quality deteriorates to a serious extent, as much as
possible of the product will be rejected. In fact, given D, n, and the fraction defective p of the
product, assumed to be constant, it is possible to calculate what proportion of the product will
be rejected. Solutions were found simultaneously by Godwin and myself, and are given below 
in Part II. Godwin’s approach, the enumeration of all possible acceptable or rejectable combina- 
tions of defectives, appears to be the more powerful, as it has been applied successfully to more 
complicated deferred sentencing schemes (examples at the end of Part II). 

7. The numerical results of this investigation are summarized in the table below. It has been
calculated on the assumption that the fractions defective are small, i.e. for the limit as p → 0.
Percentage points of Dp are given, and their interpretation will be clear from an example. Suppose
it is desired that output that is constantly ½ per cent. defective should be almost all, namely 99
per cent., accepted, in the long run. The appropriate value of D for a given value of n is found
by dividing the entry for Dp in the 99 per cent. column by the value of p, namely .005. Thus if
we choose n = 5, we shall need D = .97/.005 = 194. Reading along the row for n = 5, we see that if the
fraction defective of the output is constantly 1.84/194, i.e. 0.95 per cent., then 90 per cent. of it will
be accepted in the long run. Similarly, if the fraction defective is 1.8 per cent., 50 per cent. of the
output will be accepted, if it is 3.1 per cent. only 10 per cent. will be accepted, and if the fraction
defective is 4.6 per cent. only 1 per cent. of the output will be accepted.

Percentage Points of Dp

Values of Dp for which a given proportion of the output will be accepted

  n    99 per cent.   90 per cent.   50 per cent.   10 per cent.   1 per cent.
  3        .35            .89           2.20            4.5            7.3
  4        .63           1.34           2.84            5.3            8.1
  5        .97           1.84           3.54            6.1            9.0
  6       1.36           2.39           4.28            7.0           10.0
  7       1.78           2.96           5.04            7.9           11.0
 10       3.25           4.85           7.43           10.8           14.2


If, for any n, the value of Dp is plotted against the percentage acceptance on logarithmic 
probability graph paper, the points lie on a slightly curved line, and intermediate percentage 
points can be read off (see Figure). 
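
The worked example of §7 can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: the names `DP_TABLE`, `choose_D` and `operating_points` are hypothetical, the figures are the percentage points of Dp as read from the table above, and rounding D to the nearest integer is an assumption made here.

```python
# Percentage points of Dp as read from the table in §7 (limit as p -> 0).
LEVELS = (99, 90, 50, 10, 1)  # percentage of the output accepted
DP_TABLE = {
    3: (0.35, 0.89, 2.20, 4.5, 7.3),
    4: (0.63, 1.34, 2.84, 5.3, 8.1),
    5: (0.97, 1.84, 3.54, 6.1, 9.0),
    6: (1.36, 2.39, 4.28, 7.0, 10.0),
    7: (1.78, 2.96, 5.04, 7.9, 11.0),
    10: (3.25, 4.85, 7.43, 10.8, 14.2),
}


def choose_D(n, p_good, level=99):
    """D such that output of quality p_good is accepted at the given level."""
    dp = DP_TABLE[n][LEVELS.index(level)]
    return round(dp / p_good)  # rounding is an assumption of this sketch


def operating_points(n, D):
    """Fraction defective p at which each tabulated percentage is accepted."""
    return {level: dp / D for level, dp in zip(LEVELS, DP_TABLE[n])}


D = choose_D(5, 0.005)
for level, p in operating_points(5, D).items():
    print(f"{level:>2} per cent. accepted at p = {100 * p:.2f} per cent.")
```

Dividing each entry of a row by D in this way gives the five points of the operating characteristic discussed in the example.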

8. For a given scheme (i.e. given D and n), the graph of the proportion of the output accepted
against the fraction defective p, assumed constant, may be termed the operating characteristic of
the scheme, by analogy with sampling schemes for bulks of a definite size (see for example Ref. 2). 
It should be noted, however, that there is some difference in kind between the two concepts. 
With a deferred sentencing scheme the operating characteristic shows the proportion of the output 
that will be accepted in the long run if its quality is constant, or the chance that any particular 
lot will be accepted, if the quality remains constant for several lots on either side. On the other 
hand, if the product is divided into bulks and a sampling scheme applied to each separately, the 
operating characteristic gives the chance that any single bulk will be accepted (irrespectively of the 
quality of neighbouring bulks), or, if all the bulks should have the same quality, the proportion 
of the output accepted in the long run. 

9. We consider now how to choose n, the size of defect cluster in the sentencing rule. If the
quality of the output remained constant, or only changed very slowly, the properties of the
sentencing scheme would be summed up by the operating characteristic. With any given quality of
output, a certain proportion of the output would be rejected, according to the operating
characteristic, and this independently of the rate of testing, i.e. of the number of articles tested per 1,000
produced. It is obvious from the table above that the larger the value of n the sharper is the
distinction given by the scheme between good (acceptable) and bad (rejectable) quality; for the
ratio of the values of p for which there are any two proportions of the output accepted, e.g. 99
per cent. and 1 per cent., approaches 1 as n increases. Thus the scheme will be most sensitive,
and therefore most useful, when n is large.
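
The sharpening with increasing n can be checked directly from the tabulated values. The following sketch is illustrative only; the name `sharpness_ratio` is hypothetical, and the figures are the 99 per cent. and 1 per cent. points of Dp as read from the table in §7.

```python
# (99 per cent. point, 1 per cent. point) of Dp for each tabulated n.
DP_99_AND_1 = {
    3: (0.35, 7.3),
    4: (0.63, 8.1),
    5: (0.97, 9.0),
    6: (1.36, 10.0),
    7: (1.78, 11.0),
    10: (3.25, 14.2),
}


def sharpness_ratio(n):
    """Ratio of the fraction defective accepted 1 per cent. of the time
    to that accepted 99 per cent. of the time; D cancels in the ratio."""
    dp99, dp1 = DP_99_AND_1[n]
    return dp1 / dp99


for n in sorted(DP_99_AND_1):
    print(n, round(sharpness_ratio(n), 1))
```

The ratio falls steadily as n rises through the tabulated values, which is the sense in which the larger schemes discriminate more sharply.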





Operating characteristics of simple deferred sentencing schemes 



But in fact we are not much concerned with constant quality of output, excepting constant 
good quality, but rather with possible sudden deteriorations in quality, of which we require that 
the sentencing scheme shall give early warning. A sudden deterioration that is sufficiently serious 
will be detected sooner with a small value of n than with a large one. So we have two opposing 
principles, sensitivity and flexibility, requiring respectively a large and a small value of n. Of
course, the flexibility of the scheme (its power to detect a change in quality quickly) can be 
increased by increasing the rate of testing; but when that has been done as far as circumstances 
permit the choice of n will still be a matter of judgment. 

10. When deferred sentencing schemes were first proposed, the intention was that the value 
of D in the sentencing rule should not be based on a fixed standard of good quality, but on the 
recent average quality of the process. The latter was to be estimated from a run of test results, 
and D would be chosen so that if the quality stayed at its previous level the product would have a 
stated chance, such as 99 per cent., of being accepted. It was argued that the quality was expected 
to stay constant, and that the prime object of the inspection was to detect any change in quality. 
If a change were detected the reason for it should immediately be sought, even if the quality were 
still acceptable; and meanwhile, if the product were not rejected at once, sentence should be 
suspended until it was clear what the new quality level was. If the quality became stable at a 
new level, and this was acceptable, a new value of D would be used appropriate to the new level. 
This reasoning is in line with the theory of ordinary statistical quality control (see for example 
Ref. 4).

But against this practice there were two difficulties. If a significant cluster of defectives in 
the test led to outright rejection of the relevant part of the output, or at least to suspension of 
sentence pending investigation, and if the consumer was applying such a method of inspection 
to the output of two factories making the same article, the better of the two factories was more 
severely treated than the worse, in that relatively poor product from the better factory might be 
rejected, while equally bad product from the worse might be accepted. A good deal of explanation
was then necessary. And also it was found that the inspectors who operated the sentencing rule 
were unwilling to recalculate the value of D from time to time to allow for small changes in the 



202 Anscombe, Godwin and Plackett— of Deferred Sentencing in Testing [No. 2, 

general level of quality. It became clear that the ideal of constant quality, “the best the factory 
could do”, must be replaced by a stated standard of acceptable quality, arrived at by considering 
the quality of available supplies and the consumer’s needs, which should apply equally to any 
producer ; and that the method of sentencing should not be made a means of bringing pressure 
to bear on the producer to improve his quality beyond this standard. 

It should be noted that even if standards of quality are laid down the inspection scheme is
not likely to be applied entirely without regard to the quality actually achieved. If the average
quality of a factory is poor, relative to the standard of good quality, a certain proportion of the
output will be rejected, and either the standard must be relaxed or the factory will stop work.
It is possible to insert a clause into the contract to take explicit account of the average quality
of the output, to the effect that, in addition to the control afforded by a deferred sentencing scheme
with fixed constants (D, n, etc.), the average fraction defective of the output shall be calculated
from time to time by pooling the test results of the last so-many-hundred lots; this should not
exceed a stated figure, and if it does so sentence may be suspended on further output while the
question of relaxing standards is being considered.

The problem of deciding on standards of good and bad quality arises in all sampling inspection. 
For a discussion see Ref. 1.


Other Deferred Sentencing Schemes

11. We turn now to possible developments of the simple deferred sentencing scheme so far
considered. 

(i) Retest. — When a number of consecutive lots have been rejected, they may be treated as
a single batch and given a further sampling inspection on a more generous scale. A sequential 
scheme can be used, for instance. The operating characteristic of the resulting procedure can be 
very easily deduced, since the chance that any lot shall finally be rejected is the chance that it will 
be “rejected” by the simple deferred sentencing scheme multiplied by the chance that it will be 
rejected in the subsequent retest. 
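
The multiplication rule for scheme (i) can be written out as a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, and the two chances supplied are taken to be evaluated at the same constant fraction defective, the retest chance coming from whatever retest scheme is adopted.

```python
def final_rejection_chance(reject_deferred, reject_retest):
    """Chance that a lot is finally rejected under scheme (i): it must be
    "rejected" by the deferred sentencing scheme and again in the retest."""
    return reject_deferred * reject_retest


def final_acceptance_chance(accept_deferred, accept_retest):
    """The same rule expressed through acceptance chances."""
    return 1 - final_rejection_chance(1 - accept_deferred, 1 - accept_retest)
```

For example, `final_acceptance_chance(0.5, 0.5)` is 0.75, so that a retest always raises the ordinate of the operating characteristic at a given quality.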

The arrangement suffers from what may be a serious practical disadvantage, namely, that 
after the delay involved in the “rejection” under the preliminary deferred sentencing scheme there 
is further delay while the retest is being arranged and carried out. It may be preferable to augment 
the rate of testing temporarily whenever a defective is encountered in the initial inspection, in a 
routine manner, without waiting to see whether a cluster of defectives is developing. Thus we
have the following scheme. 

(ii) Two rates of testing. — In the absence of defectives, one item is tested per lot.
Immediately after a defective the rate of testing is increased to one item per one-kth part of a lot (i.e.
one per lot of one-kth the previous size). After C consecutive clear items have been found, the
lower rate of testing is resumed. Whenever n consecutive defectives occur within D or fewer
items tested, all lots or part-lots consecutively from that giving the first to that giving the nth
defective are rejected.

The operating characteristic of this scheme is given in Part II. We have the following numerical
results for n = 5, C = ½D.

Percentage Points of Dp

Values of Dp for which a given proportion of the output will be accepted

 C/D    k    99 per cent.   90 per cent.   50 per cent.   10 per cent.   1 per cent.
  0     —        .97           1.84           3.54            6.1            9.0
  ½     2       1.06           2.00           3.82            6.5            9.5
  ½     3       1.11           2.09           3.99            6.8            9.8



It will be seen that the introduction of the higher rate of testing has slightly sharpened the
operating characteristic, i.e. the ratio of the 99 per cent. and 1 per cent. values of Dp is nearer 1.
If such a scheme is compared with a simple scheme with only one rate of testing, such that the
average rate of testing of the two schemes is the same when quality is at the 99 per cent. acceptance
level, the main difference in effect between the schemes is that the first scheme will lead to a
rejection sooner than the second after a deterioration in quality has occurred, and that then the
average rate of testing of the first scheme will be greater. The fact that most of the output which
is rejected will have been tested at the higher rate may help to render the scheme acceptable to
the producer.

(iii) Rejecting on either side of the cluster. — A modification of the simple scheme of §5 that
might be considered is as follows. One item is tested per lot. D, n and G being given integers,
whenever n defective items are encountered in D or fewer consecutive items tested, all the lots
consecutively from that giving the first to that giving the nth defective, together with the G lots
next on either side of these lots, will be rejected. Thus one cluster of n defectives may involve
the rejection of as many as D + 2G lots.

We should in practice take G to be considerably smaller than D. One might, for example,
choose G = … . The properties of such a scheme have not been investigated, but it seems
intuitively likely that it would prove useful.
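
Scheme (iii) differs from the rule of §5 only in the extent of the rejection, as the following sketch shows. The function name `sentence_with_margin` and the boolean-list representation are assumptions made for illustration; the treatment of clusters near the two ends of the record (where fewer than G neighbouring lots exist) is likewise a choice made here, not something specified in the text.

```python
def sentence_with_margin(defective, D, n, G):
    """Scheme (iii): reject each significant cluster together with the G lots
    next on either side of it (illustrative sketch).

    defective: sequence of booleans, one test result per lot (True = defective).
    Returns a list of booleans, True where the lot is rejected.
    """
    total = len(defective)
    positions = [i for i, bad in enumerate(defective) if bad]
    rejected = [False] * total
    for j in range(len(positions) - n + 1):
        first, last = positions[j], positions[j + n - 1]
        if last - first + 1 <= D:  # n defectives within D or fewer lots
            start = max(0, first - G)        # do not run off the start of the record
            stop = min(total - 1, last + G)  # nor off the end
            for i in range(start, stop + 1):
                rejected[i] = True
    return rejected
```

With G = 0 the function reduces to the simple rule; with a cluster spanning exactly D lots it rejects D + 2G lots, the maximum mentioned above.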

(iv) Two simple schemes together. — Another modification, designed to satisfy the opposing 
requirements of sensitivity and flexibility mentioned in §9, is to make use of two rejection rules 
like that of §5, with constants D₁, n₁ and D₂, n₂, applied simultaneously to the same single series
of test results. We might, for example, have n₁ = 5 and n₂ = 10. The operating characteristic
of such a scheme seems to be difficult to obtain, and no attempt at investigation has been made. 
The two rejection rules do not, of course, act with statistical independence (output rejected by 
one rule will tend to be rejected by the other also), but an upper limit to the proportion of the 
output rejected is got by adding the proportions rejected by the schemes separately, and either 
of these proportions alone is a lower limit. This remark would probably suffice for most practical 
purposes. 
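
The limits stated for scheme (iv) amount to a one-line calculation, sketched below. The function name is hypothetical, and capping the upper limit at 1 is an addition made here on the ground that a proportion cannot exceed unity.

```python
def combined_rejection_bounds(r1, r2):
    """Limits to the proportion rejected when two simple rules act together.

    r1, r2: proportions of the output rejected by each rule acting alone,
    at the same constant fraction defective.
    """
    lower = max(r1, r2)        # either proportion alone is a lower limit
    upper = min(1.0, r1 + r2)  # the sum is an upper limit (capped at 1 here)
    return lower, upper
```

When both separate proportions are small, as they will be at acceptable quality, the two limits lie close together and the remark in the text suffices.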

(v) Several items tested per lot. — A modification of the preceding schemes that will usually be 
made in practice arises from the inconvenience or impossibility of storing the product in small 
identifiable lots relating to single test items, pending sentence. If the rate of test is 5 per 1,000, 
according to the above rules the product, as it comes off the line, should be aggregated into lots 
of size 200, and each lot kept distinct until sentence is passed. Such a small lot size may be incon- 
venient to the manufacturer, if there is any considerable delay in the testing (as there might well 
be if the test were one of endurance), as then he will always have a large number of such small lots 
in his store, each required to be kept distinct. It may be more convenient to label his output 
in lots of size 1,000 or 2,000, and not accept or reject in smaller amounts. And in some cases 
the lot size is determined by the process, as when the product consists of a powder or fluid used 
as a filling for some article. The unit of production here is usually the mix, and if that is of
sufficient size to make 1,000 fillings say, five fillings would be selected at random from the mix, 
and there would be no question of subdividing the mix into five parts of sufficient size for 200 
fillings. 

Deferred sentencing schemes in which several items are tested per lot have not been
considered in general, but Plackett and Godwin have investigated two particular schemes of this
kind (see Part II).

12. The schemes so far considered have been symmetrical, in that the same amount of produc- 
tion would have been accepted if the observations had occurred in the reverse order. This is 
even true of scheme (ii), which appears unsymmetrical at first glance. Actually, of course, we 
encounter the test results in one direction only, and there is no obvious efficiency in symmetry. 
An unsymmetrical scheme that has been used (not, I think, for any very good reason) is like that 
of §5 except that exactly D lots are rejected when a significant cluster occurs, either the D lots up 
to and including that giving the last defective in the cluster, or the D lots starting with that giving 
the first defective in the cluster. Another possible non-symmetrical device is the following.

(vi) Quick release. — A quick release clause can be inserted in the sentencing rule of §5, to the
effect that when a defective occurs after a run of clear items for which the corresponding lots
have been accepted, if the next A items following the defective are clear (A an integer less than D),
the lots corresponding to these items and the defective one should be accepted forthwith. A can
be chosen so that the chance that the D − A − 1 items following the A clear ones shall contain n − 1
or more defectives is very small, e.g. .01, if the quality of production remains constant. The
insertion of the quick release clause will then have almost no effect on the operating characteristic
of the scheme, but may speed up the sentencing appreciably when quality is good.
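
The choice of A for the quick release clause can be sketched as a binomial tail calculation. The function names below are hypothetical, and taking the smallest A that meets the stated risk (so that release is as quick as possible) is an assumption of this sketch; the text requires only that the chance be very small.

```python
from math import comb


def chance_at_least(k, m, p):
    """Chance of k or more defectives among m items, each independently
    defective with probability p (binomial tail)."""
    if k <= 0:
        return 1.0
    if m < k:
        return 0.0
    return 1.0 - sum(comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(k))


def smallest_quick_release_A(D, n, p, risk=0.01):
    """Smallest A such that the D - A - 1 items following the A clear ones
    contain n - 1 or more defectives with chance at most `risk`.

    Returns None if no A less than D satisfies the condition.
    """
    for A in range(D):
        if chance_at_least(n - 1, D - A - 1, p) <= risk:
            return A
    return None
```

For the scheme of §7 one would call `smallest_quick_release_A(194, 5, 0.005)`, the quality p being that at which production is expected to run.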




13. All these deferred sentencing schemes are based on the idea of picking out clusters of
a certain number of defectives which occur within a given space. This sort of criterion has the 
simplicity which was required in such applications of the method as have been made. But the 
reader may well feel that a more comprehensive and less arbitrary rule, in which the decision to 
accept or reject any lot was based on the closeness of defectives on either side, taking into account 
a large number of defectives and attaching varying weights to them according to their remoteness, 
would be more powerful. A first step in this direction is shown in scheme (iv) above, but this is 
still quite arbitrary in its choice of n₁ and n₂.

The practical problem may be crystallized as follows. After a run of uniform acceptable quality, 
the output deteriorates, the quality, let us say, undergoing a sudden change to a new unacceptable 
level, where it remains. We require a sentencing scheme under which the run of acceptable 
quality will almost all be accepted, and after the deterioration there will be a rejection (a) with the
least possible delay and (b) covering as much as possible of the output after the deterioration. 
One may surmise that no scheme will be uniformly optimum for all possible levels of unacceptable 
quality; one scheme will be more efficient in detecting a small change in quality and another in 
detecting a large change. The solution of this problem has not been attempted, nor the investi- 
gation of any of the obvious criteria that suggest themselves. If the problem of detecting a change 
in frequency in a series of random events should occur in other connexions than routine inspection 
of a factory’s output by non-scientific staff, it may be worth somebody's while to go into the matter 
further. 


Comparison with Sequential Sampling

14. The considerable diversity that is possible with deferred sentencing schemes has been 
indicated, and a certain amount of numerical information has been given, so that some of them 
could be put into immediate operation. In Part II methods are described for obtaining similar 
information about other schemes. 

It remains to consider what are the peculiar virtues of “deferred sentencing”. As remarked
in §3, an alternative way to arrange the inspection is to divide the product into batches of a suitable
size and test each individually by a sequential method. We must see whether this is ever a
preferable procedure.

(The individual inspection of batches is the only alternative that I am aware of to deferred 
sentencing. Two sampling inspection schemes for continuous production that have appeared, 
the Dodge and Wald-Wolfowitz (Refs. 3 and 8), refer to non-destructive inspection and are not
relevant.) 

15. Suppose that the product, as it comes off the line, is divided into batches of a certain size,
and a sampling inspection scheme is applied to each batch. The operating characteristic of the 
scheme indicates what is the chance of acceptance of a batch with any given average fraction 
defective. (If the sampling from the batch is random or suitably stratified the inspection will be 
a test of the average quality of the batch, and the batch will, of course, be sentenced as a whole, 
however uneven its quality may be.) 

Let us suppose to begin with that the quality of the output is constant. Then corresponding
to any deferred sentencing scheme we can find a sequential scheme having almost the same
operating characteristic. For example, if p₀ is the fraction defective with 99 per cent. chance
of acceptance, a simple deferred sentencing scheme with

    n = 5,  D = .97/p₀

and a closed sequential scheme with (in the notation of Ref. 1)

    b + 1 = .322/p₀,  H₀ = 2(b + 1),  H₁ = 3(b + 1),  closed at 4(b + 1),

have nearly the same operating characteristic: 99 per cent. acceptance at p = p₀, 90 per cent. at
p = 1.9 p₀, 50 per cent. at p = 3.7 p₀, 10 per cent. at p = 6.3 p₀, 1 per cent. at p = 9.3 p₀. (The
sequential scheme is as follows. Take successive samples of size .322/p₀. The batch is rejected as
soon as 4 defectives are found, and it is accepted if there are no defectives in the first two samples,
or 1 only in the first three, or 2 or 3 in four samples.) The two schemes have just about the
same effect, at almost any rate of testing, as long as the quality of output remains constant.
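
The verbal description of the closed sequential scheme in §15 can be turned into a small decision procedure. The sketch below is an illustration only: the function name `sentence_batch` is hypothetical, and it works sample by sample, so that a rejection arising part-way through a sample is reported at the end of that sample. No attempt is made here to reproduce the operating characteristic quoted in the text.

```python
def sentence_batch(defects_per_sample):
    """Apply the closed sequential rule described in §15 (illustrative sketch).

    defects_per_sample: numbers of defectives found in successive samples
    of equal size, in the order tested (at most four are used).
    Returns (decision, samples_used).
    """
    total = 0
    for k, found in enumerate(defects_per_sample[:4], start=1):
        total += found
        if total >= 4:
            return "reject", k      # 4 defectives found: batch rejected
        if k == 2 and total == 0:
            return "accept", k      # no defectives in the first two samples
        if k == 3 and total == 1:
            return "accept", k      # 1 only in the first three samples
        if k == 4:
            return "accept", k      # 2 or 3 defectives in four samples
    raise ValueError("more samples are needed before sentence can be passed")
```

For instance, `sentence_batch([1, 0, 0])` accepts after three samples, while `sentence_batch([2, 2])` rejects after two.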

In order to compare the merits of the schemes we consider what happens when there is a change
of quality. Suppose that the size of batch is so adjusted that the average rate of testing with the
sequential scheme when p = p₀ is equal to the rate of sampling with the deferred sentencing
scheme. Since the average rate of sampling in the first case is 2.70 (b + 1) = .87/p₀ per “batch”,
and the rate in the second case is 1 per “lot”, the batch will be chosen to be equal to .87/p₀ lots.

The comparison of the schemes is not very simple, since it will vary with the magnitude of the 
change in quality and its duration. Moreover, whereas the properties of the sequential scheme 
are known fully and have been tabulated (Ref. 1), the response of a deferred sentencing scheme to 
changes in quality has not been investigated exactly. Some rough calculations have yielded the 
following result. If the changes in quality are not related to the batching, i.e. if there is no tendency 
for batches to be of uniform quality within themselves and for changes in quality to occur at a 
change of batch, the two methods of inspection are about equally effective. But if the quality 
goes with the batches, the sequential scheme is considerably more efficient. Since this possibility 
was specifically excluded in the conditions of §1, we may conclude that the decision between the 
two types of inspection scheme will be based mainly on considerations other than economy of 
testing. The same conclusion would probably hold good if the simple §5 scheme were replaced 
by any other kind of deferred sentencing scheme. 
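
The matching of the two rates of testing in §15 is a matter of simple arithmetic, sketched below for a given p₀. The function name `matched_design` is hypothetical; the constants .97, .322 and 2.70 are those quoted in the text for the n = 5 scheme and its sequential counterpart, as read from the printed figures.

```python
def matched_design(p0):
    """Constants of the two schemes of §15 for a given p0 (illustrative).

    Returns (D, sample_size, batch_in_lots), where one item is tested per
    lot under the deferred sentencing scheme.
    """
    D = 0.97 / p0                           # window of the n = 5 deferred scheme
    sample_size = 0.322 / p0                # b + 1 in the notation of Ref. 1
    avg_items_tested = 2.70 * sample_size   # average items tested per batch at p0
    batch_in_lots = avg_items_tested        # equal rates: one lot per item tested
    return D, sample_size, batch_in_lots
```

With p₀ = .005, for example, this gives D close to 194 and a batch of about 174 lots.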

16. There is a certain amount of difference in operation between the two types of inspection. 
In the deferred sentencing method, test items are selected at regular intervals from the output
and can be despatched for test at once. After the test sentence may be delayed for a certain 
number of lots. In the sequential method, test items are only selected when a whole batch has 
been completed. The number of items that will be needed in the test is not certain, and in some 
cases it will be economical to select and despatch to the place of test the maximum number of 
items that are likely to be needed, to avoid possible delays in sending for more— test items not 
used being returned. (The same point arises with scheme (ii) above.) The test itself automati- 
cally decides the sentence on the batch, and there should be no question of referring decisions 
back to "headquarters". 

Thus when the test is cumbersome or time-consuming, either in itself or in its preparation, the
two inspection methods may involve a different delay between production and sentence. The 
delay, and the cost of sampling, transport, etc., will have to be assessed for each method and 
compared. There should be no appreciable difference in the ease of operation of the methods 
— one of them is not more likely to lead to confusion than the other. The manufacturer may, if 
he wishes, construct quality control charts from the test results of either, though he will probably 
feel that the regular sampling rate used with deferred sentencing makes charting a little easier. 

17. But there is one consideration that, whenever deferred sentencing methods have been 
used, has far outweighed any of the above. In order to use a deferred sentencing scheme it is 
not necessary to have decided on the standards of good and bad quality before carrying out the 
test, and the standards may be altered at any time subsequent to the test without affecting the 
appropriateness of the test itself. With sequential inspection this is not so. The only way of 
testing individual batches that is independent of the quality standards is by "single sampling", in 
which a fixed number of items are tested per batch, without regard to the number of defectives 
occurring.* This is in general a considerably less efficient procedure than sequential sampling, 
and will clearly be less efficient than deferred sentencing, which in its simplest form of §5 is a kind 
of single sampling with an undetermined and moveable batching. 

It has been found that the consumer is often not prepared to make in advance any final state- 
ment of his quality requirements, a statement that until further notice lots will irrevocably be 
sentenced to such and such standards. On the contrary, he will accept the output as long as he 

* I am considering the ordinary known methods of sampling inspection: single sampling, double 
sampling, and Wald sequential inspection. It would, of course, be possible to use some kind of sequential
inspection procedure which did not itself involve giving a sentence (as the Wald procedure does), but it 
seems unlikely that any such scheme would be used in the present instance. 



206 Anscombe, Godwin and Plackett: Methods of Deferred Sentencing in Testing [No. 2,


considers the quality to be good, and if a deterioration occurs he will then consider whether his 
previous requirements should be relaxed, taking into account the available supplies, his needs, 
etc. The decision to accept or reject may be taken some considerable time after the change in 
quality has occurred, sentence being meanwhile suspended. Such an attitude of waiting until 
trouble develops before deciding what to do about it may be quite sound from the consumer’s 
point of view, but blocks any attempt the statistician might make to design a really economical 
sequential procedure of inspection. 


References

1. Anscombe, F. J., "Tables of sequential inspection schemes to test fraction defective". (In preparation.)

2. Barnard, G. A. (1946), "Sequential tests in industrial statistics", J. Roy. Stat. Soc., Supplt., 8, 1.

3. Dodge, H. F. (1943), "A sampling inspection plan for continuous production", Ann. Math. Stat., 14, 264.

4. Dudding, B. P., and Jennett, W. J. (1942), "Quality control charts", British Standards Institution, B.S. 600R.

5. Sealy, E. H. (1943), "Rational sentencing of proof results". Ministry of Supply Advisory Service on Statistical Method (S.R. 17), Technical report QC/S/6. (Unpublished.)

6. Spalding, H. B. (1945), "Rational sentencing of fuzes". Ministry of Supply Permanent Records of Research and Development, Monograph No. 15.453. (Unpublished.)

7. Wald, A. (1945), "Sequential tests of statistical hypotheses", Ann. Math. Stat., 16, 117.

8. Wald, A., and Wolfowitz, J. (1945), "Sampling inspection plans for continuous production which insure a prescribed limit on the outgoing quality", ibid., 16, 30.

F. J. A. 


Part II. Derivation of Operating Characteristics
First Method

The operating characteristic of the simple deferred sentencing scheme described in §5 above,
and also that of the more elaborate scheme (ii) of §11, will now be derived, for the limit when the
fraction defective $p \to 0$ and $D \to \infty$.

For the scheme of §5, the problem can be put abstractly as follows. (To begin with, we assume
$p \ne 0$, $D \ne \infty$.) An array of points (of indefinite number) consists of "defectives" and
"non-defectives" in random order, the probability of a defective being p. Integers n and D being given,
whenever n consecutive defectives occur in a space of D points or less, all points from the first
to the nth defective inclusive are "rejected". Points not rejected are accepted. We require to
calculate the proportion of points accepted, on the average.
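
The rule just stated can be tried numerically. The following Python sketch is an illustration only and is not part of the original derivation: the function names `accepted_fraction` and `simulate`, the default sequence length and the use of a pseudo-random sequence of independent trials are all assumptions made for the example.

```python
import random


def accepted_fraction(defective, n, D):
    """Fraction of points accepted under the rule of the scheme of section 5.

    defective: sequence of 0/1 flags, 1 marking a defective point.
    Whenever n consecutive defectives lie in a space of D points or less,
    every point from the first to the nth of them inclusive is rejected.
    """
    positions = [i for i, d in enumerate(defective) if d]
    rejected = [False] * len(defective)
    for j in range(len(positions) - n + 1):
        first, last = positions[j], positions[j + n - 1]
        if last - first + 1 <= D:  # the n defectives span D points or fewer
            for i in range(first, last + 1):
                rejected[i] = True
    return 1 - sum(rejected) / len(defective)


def simulate(p, n, D, length=100_000, seed=0):
    """Estimate the proportion accepted for fraction defective p (hypothetical helper)."""
    rng = random.Random(seed)
    sequence = [1 if rng.random() < p else 0 for _ in range(length)]
    return accepted_fraction(sequence, n, D)
```

For a long sequence with small p and large D, `simulate(p, n, D)` should lie close to the limiting operating characteristic derived below with u = Dp, although the agreement is only approximate for finite D.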

As regards the simple mathematical properties of such a sequence of defectives and
non-defectives, it is easy to verify the following statements.

(i) The chance that the tth defective beyond (i.e. to one side of) any given point should occur
at the rth point is

$$\binom{r-1}{t-1} p^t q^{r-t},$$

where $q = 1 - p$.

(ii) Given the position of the tth defective, i.e. given r, all sets of positions of the first (t − 1)
defectives are equally likely, and have probability

$$1 \Big/ \binom{r-1}{t-1}.$$

(iii) It follows, for example, that the probability that the (t − 1)th defective is at the sth point,
given that the tth defective is at the rth, is

$$\binom{s-1}{t-2} \Big/ \binom{r-1}{t-1} \qquad (t - 1 \le s \le r - 1).$$

(iv) The chance that any given point should be a non-defective and lie in a run of precisely
r consecutive non-defectives is

$$r p^2 q^r \qquad (r \ge 1).$$

To find the required operating characteristic, we first find the chance that any point A is a
non-defective and is accepted. We have to consider the positions of the (n − 1) defectives on either



1947] 


the Fraction Defective of a Continuous Output 


207 


side of A. Suppose the first defective on either side occurs at the points B and C, and let the
next (n − 2) defectives beyond B occur at distances $r_1, r_2, \ldots, r_{n-2}$ beyond B, and similarly
the next (n − 2) defectives beyond C occur at distances $s_1, s_2, \ldots, s_{n-2}$ beyond C. The
notation is shown in the diagram, where a heavy dot indicates a defective.


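[Diagram: the point A on the line between the defectives C and B, with the distances $s_1, s_2, \ldots$ of the further defectives measured from C and the distances $r_1, r_2, \ldots$ measured from B.]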


Then if the number of points from B to C inclusive is D − a (a an integer, n − 2 ≤ a ≤ D − 3),
the point A will be accepted if simultaneously

$$r_1 + s_{n-3} > a, \quad r_2 + s_{n-4} > a, \quad \ldots, \quad r_{n-3} + s_1 > a, \quad r_{n-2} > a, \quad s_{n-2} > a.$$

For given values of a, $r_{n-2}$, and $s_{n-2}$ (both the latter being greater than a), we can find the chance
that the remaining inequalities hold, by induction, using (ii) and (iii) above. We then apply
(i) to find the chance that the points between B and C will be accepted, whatever the values of
$r_{n-2}$ and $s_{n-2}$, a being still given. Finally we apply (iv) to reach the required chance that A is
a non-defective and accepted. The summation involved in this last step is very tedious.

A slight modification of the preceding working yields the chance that any point is a defective 
and not rejected. Adding the two results together we have the operating characteristic of the 
scheme. 

In practice we are usually interested in small values of p and large values of D, and the
calculations become far less tedious if we set $Dp = u$ and let $D \to \infty$, $p \to 0$, at the outset. We can
then reframe the problem as follows.

A line (of indefinite extent) contains certain points, known as "defectives", distributed
in a uniform probability distribution, at the rate of u defectives on the average per unit
length of line. An integer n being given, whenever n consecutive defectives occur within
an interval of unit length, the part of the line from the first to the nth defective is "rejected".
Any portion of the line not rejected by this rule is accepted. We require to find the
proportion of the whole line accepted.
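
The continuous formulation lends itself to a direct check by simulation. The Python sketch below is only an illustration under stated assumptions: the defectives are generated as a Poisson process by summing exponential gaps, and the function name `simulate_line` and its parameters are hypothetical. For n = 2 the estimate can be compared with the value $(1 + u)e^{-u}$ tabulated later in this Part.

```python
import math
import random


def simulate_line(u, n, n_defectives=50_000, seed=0):
    """Estimate the proportion of the line accepted (illustrative sketch).

    Defectives are laid down at rate u per unit length. Whenever n consecutive
    defectives fall within unit length, the stretch from the first to the nth
    is rejected; overlapping rejected stretches are counted once.
    """
    rng = random.Random(seed)
    positions = []
    x = 0.0
    for _ in range(n_defectives):
        x += rng.expovariate(u)  # exponential gap between consecutive defectives
        positions.append(x)

    rejected_length = 0.0
    covered_to = positions[0]  # right-hand end of the rejected stretch so far
    for j in range(len(positions) - n + 1):
        first, last = positions[j], positions[j + n - 1]
        if last - first <= 1.0:
            start = max(first, covered_to)
            if last > start:
                rejected_length += last - start
            covered_to = max(covered_to, last)

    total_length = positions[-1] - positions[0]
    return 1 - rejected_length / total_length


estimate = simulate_line(1.0, 2)
exact = (1 + 1.0) * math.exp(-1.0)  # tabulated value for n = 2 at u = 1
```

With fifty thousand simulated defectives the estimate and the exact value should agree to about two decimal places; the agreement is statistical, not exact.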

Our previous statements (i) to (iv) now read:

(i) The chance that the tth defective beyond any given point O should occur at distance $x_t$ is

$$\frac{u^t x_t^{t-1}}{(t-1)!}\, e^{-u x_t}\, dx_t \qquad (0 < x_t < \infty).$$

(ii) Given the position $x_t$ of the tth defective, the joint distribution of the distances beyond O
of the first (t − 1) defectives is

$$\frac{(t-1)!}{x_t^{t-1}}\, dx_1\, dx_2 \ldots dx_{t-1} \qquad (0 < x_1 < x_2 < \ldots < x_t).$$

(iii) The distribution of $x_{t-1}$, given $x_t$, is

$$(t-1)\, \frac{x_{t-1}^{t-2}}{x_t^{t-1}}\, dx_{t-1} \qquad (0 < x_{t-1} < x_t).$$

(iv) The chance that any given point lies between two consecutive defectives spaced x apart is

$$u^2 x\, e^{-ux}\, dx \qquad (0 < x < \infty).$$




We can derive the chance that any given point A is accepted by the method already outlined.
We need only consider the case when A is a non-defective point, as the chance that it is a defective
is zero. Suppose A lies between two defectives B and C distant $1 - \alpha$ apart ($0 < \alpha < 1$). Let
the distances of the next n − 2 defectives beyond B be $x_1, \ldots, x_{n-3}, x_{n-2} = X$, and those
beyond C be $y_1, \ldots, y_{n-3}, y_{n-2} = Y$, say. Then for given X and Y ($> \alpha$), it is easy to prove
by induction that the chance that the following inequalities hold simultaneously

$$x_1 + y_{n-3} > \alpha, \quad x_2 + y_{n-4} > \alpha, \quad \ldots, \quad x_{n-3} + y_1 > \alpha$$

is

$$1 - \frac{n-3}{n-2}\, \frac{\alpha^{n-2} (X + Y - \alpha)^{n-4}}{(XY)^{n-3}}.$$

The probability that the stretch BC of the line will be accepted is the expected value of this quantity
as X and Y vary, in the distribution given at (i) above, i.e. it is

$$\rho(\alpha, u) = \int_\alpha^\infty \int_\alpha^\infty \left[ 1 - \frac{n-3}{n-2}\, \frac{\alpha^{n-2} (X + Y - \alpha)^{n-4}}{(XY)^{n-3}} \right] \frac{u^{2n-4} (XY)^{n-3} e^{-u(X+Y)}}{[(n-3)!]^2}\, dX\, dY$$

$$= \left[ \int_{\alpha u}^\infty \frac{x^{n-3}}{(n-3)!}\, e^{-x}\, dx \right]^2 - \frac{(\alpha u)^{n-2}}{(n-2)!\,(n-4)!} \int_{\alpha u}^\infty \int_{\alpha u}^\infty (x + y - \alpha u)^{n-4} e^{-(x+y)}\, dx\, dy$$

$$= e^{-2\alpha u} \left[ 1 + 2(\alpha u) + 2^2 \frac{(\alpha u)^2}{2!} + \ldots + 2^{n-3} \frac{(\alpha u)^{n-3}}{(n-3)!} + \sum_{i=1}^{n-3} \left\{ 2^{n-3+i} - 2 \sum_{m=0}^{i-1} \binom{n-3+i}{m} - (n-2-i) \binom{n-3+i}{i-1} \right\} \frac{(\alpha u)^{n-3+i}}{(n-3+i)!} \right].$$

This is for n ≥ 4. For n = 3 we have simply

$$\rho(\alpha, u) = e^{-2\alpha u},$$

and for n = 2, $\rho(\alpha, u) = 0$.

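If the points B and C are more than unit distance apart, clearly the stretch BC will be accepted.
The chance that the point A will be accepted is therefore, from (iv) above,

$$\int_1^\infty u^2 x\, e^{-ux}\, dx + \int_0^1 u^2 x\, e^{-ux} \rho(1 - x, u)\, dx = (1 + u) e^{-u} + e^{-u} \int_0^u (u - t)\, e^{t} \rho(t/u, u)\, dt. \qquad (1)$$

Now

$$\int_0^u (u - t)\, e^{-t}\, \frac{t^r}{r!}\, dt = u - (r + 1) + e^{-u} \left[ (r + 1) + r u + (r - 1) \frac{u^2}{2!} + \ldots + \frac{u^r}{r!} \right].$$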





Thus the desired operating characteristic (i.e. chance that any point is accepted, or proportion
of the whole line accepted) is, when n ≥ 4,

$$P_n(u) = \{(-c_0 + 1) + (c'_1 + 1) u\}\, e^{-u} + \left\{ c_0 + c_1 u + c_2 \frac{u^2}{2!} + \ldots + c_{2n-6} \frac{u^{2n-6}}{(2n-6)!} \right\} e^{-2u}, \qquad (2)$$

where the coefficients $c'_1, c_0, c_1, \ldots$ are found by writing down the coefficients occurring
in $\rho(\alpha, u)$, namely

$1, 2, 2^2, \ldots, 2^{n-3}$, and (n − 3) further quantities (1 ≤ i ≤ n − 3)

$$2^{n-3+i} - 2 \sum_{m=0}^{i-1} \binom{n-3+i}{m} - (n-2-i) \binom{n-3+i}{i-1}, \qquad (2')$$

and adding them to get $c'_1$, multiplying them by (1, 2, 3, . . . 2n − 5) and adding to get $c_0$,
multiplying by (0, 1, 2, . . . 2n − 6) and adding to get $c_1$, multiplying by (0, 0, 1, 2, . . .
2n − 7) for $c_2$, etc. This form of the answer is convenient for numerical purposes. If we express
the powers of 2 in the set of coefficients (2') as binomial expansions of the same powers of
(1 + 1), and make repeated use of Lemma 2 in Godwin's solution below we can deduce his
simpler-looking expressions.

For n = 2 or 3, the evaluation of (1) is immediate. We can thus tabulate some operating
characteristics, as follows:

n = 2: $(1 + u) e^{-u}$

n = 3: $2u\, e^{-u} + e^{-2u}$

n = 4: $(-7 + 5u) e^{-u} + (8 + 4u + \tfrac{1}{2} u^2) e^{-2u}$

n = 5: $(-42 + 14u) e^{-u} + (43 + 30u + 9u^2 + \tfrac{4}{3} u^3 + \tfrac{1}{12} u^4) e^{-2u}$

n = 6: $(-198 + 42u) e^{-u} + (199 + 158u + 59u^2 + \tfrac{40}{3} u^3 + \tfrac{23}{12} u^4 + \tfrac{1}{6} u^5 + \tfrac{1}{144} u^6) e^{-2u}$

n = 7: $(-858 + 132u) e^{-u} + (859 + 728u + 299u^2 + \tfrac{235}{3} u^3 + \tfrac{173}{12} u^4 + \tfrac{23}{12} u^5 + \tfrac{13}{72} u^6 + \tfrac{1}{90} u^7 + \tfrac{1}{2880} u^8) e^{-2u}$
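
The numerical recipe described above (write down the coefficients occurring in ρ, add them, then multiply by the shifted integer sequences and add) is easily mechanised. The Python sketch below is an illustration and not the authors' computation. The function names are hypothetical, and the closed form used for the (n − 3) further coefficients is an assumption of this sketch, chosen because it reproduces the tabulated cases n = 4 and n = 5.

```python
from math import comb, exp, factorial


def rho_coefficients(n):
    """Coefficients of (alpha*u)^k / k! inside rho(alpha, u), k = 0 .. 2n-6 (n >= 3)."""
    h = n - 3
    coeffs = [2 ** k for k in range(h + 1)]  # 1, 2, 4, ..., 2^(n-3)
    for i in range(1, h + 1):  # the (n - 3) further quantities
        K = h + i
        coeffs.append(
            2 ** K
            - 2 * sum(comb(K, m) for m in range(i))
            - (n - 2 - i) * comb(K, i - 1)
        )
    return coeffs


def operating_characteristic(n, u):
    """P_n(u) built from the coefficients by the recipe following equation (2)."""
    if n == 2:
        return (1 + u) * exp(-u)
    b = rho_coefficients(n)
    c1_prime = sum(b)
    # c_0 uses multipliers (1, 2, 3, ...), c_1 uses (0, 1, 2, ...), c_2 uses (0, 0, 1, ...)
    c = [sum(max(k - j + 1, 0) * bk for k, bk in enumerate(b)) for j in range(len(b))]
    poly = sum(cj * u ** j / factorial(j) for j, cj in enumerate(c))
    return ((1 - c[0]) + (c1_prime + 1) * u) * exp(-u) + poly * exp(-2 * u)
```

For example, `rho_coefficients(4)` gives the coefficients 1, 2, 1, from which the recipe yields $c'_1 = 4$, $c_0 = 8$, $c_1 = 4$, $c_2 = 1$, in agreement with the entry for n = 4.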


The operating characteristic of scheme (iii) of §11 may be found similarly, but now the chance 
that a point A is accepted, given the positions B and C of the defectives next on either side, no 
longer depends simply on the distance BC, but on both distances AB, AC separately; and the step 
corresponding to equation (1) involves a double integral. 

To turn now to the scheme (ii) of §11, we can give an abstract formulation for the limiting
case $p \to 0$, as follows. We have set $Dp = u$.

A line (the "testing" line) contains points known as defectives, distributed in a uniform
probability distribution of average rate one defective per unit length of line. To each
point on the line corresponds a point on a second line, the "production" line. In the
absence of defectives, an interval on the testing line corresponds to an equal interval on
the production line, but when a defective occurs in the testing line the rate of
correspondence is changed to $1 : \frac{1}{k}$ for intervals following (i.e. to the right-hand side of)
the defective. After an interval in the testing line of length c clear of defectives, the rate of
correspondence returns to 1 : 1, until the next defective is encountered. Whenever n defectives
occur within a unit interval, the part of the production line corresponding to the stretch from
the first to the nth defective in the testing line is rejected. It is required to know what
proportion of the whole production line is rejected.


SUPP. VOL. IX, NO. 2. 




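Consider any point A on the testing line. If it lies between two consecutive defectives spaced
x apart, the corresponding interval on the production line is of length

$$\frac{x}{k} \ \text{ if } 0 < x < c, \qquad \frac{c}{k} + (x - c) \ \text{ if } x > c.$$

The average rate of correspondence between the production and testing lines (ratio of
corresponding intervals) is therefore, from (iv),

$$\frac{1}{k} \int_0^c u^2 x\, e^{-ux}\, dx + \int_c^\infty u^2 \left\{ \frac{c}{k} + (x - c) \right\} e^{-ux}\, dx = \frac{1}{k} + \left( 1 - \frac{1}{k} \right) e^{-uc}, \qquad (3)$$

while the average acceptance rate (expected acceptance on the production line per unit length of
testing line) is, from (1), if c < 1,

$$\frac{1}{k} \left[ \int_0^1 u^2 x\, e^{-ux} \rho(1 - x, u)\, dx + \int_1^\infty u^2 x\, e^{-ux}\, dx \right] + \left( 1 - \frac{1}{k} \right) \left[ \int_c^1 u^2 (x - c)\, e^{-ux} \rho(1 - x, u)\, dx + \int_1^\infty u^2 (x - c)\, e^{-ux}\, dx \right].$$

This is easily reduced to the form

$$\frac{1}{k} P_n(u) + \left( 1 - \frac{1}{k} \right) e^{-uc} P_n(u(1 - c)), \qquad (4)$$

where $P_n(u)$ is as defined in equation (2). Dividing (4) by (3), we have the operating characteristic
of the scheme. If c ≥ 1, the factor $P_n(u(1 - c))$ in (4) is replaced by unity. The average rate of
sampling, as a multiple of the minimum rate for no defectives (u = 0), is the reciprocal of (3).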
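
The division of (4) by (3) may be illustrated numerically. The Python sketch below is hedged accordingly: it assumes that (3) reduces to $1/k + (1 - 1/k)e^{-uc}$ and that (4) is $P_n(u)/k + (1 - 1/k)e^{-uc}P_n(u(1 - c))$, and the function name `scheme_ii_operating_characteristic` is hypothetical. The simple-scheme characteristic is passed in as a function; the case n = 2, $P_2(u) = (1 + u)e^{-u}$, is used as the example.

```python
from math import exp


def scheme_ii_operating_characteristic(P, u, k, c):
    """Return (operating characteristic, relative sampling rate) for scheme (ii).

    P is a function giving P_n(u) for the simple scheme; k and c are the
    parameters of the abstract formulation (rate 1 : 1/k, clearance length c).
    """
    correspondence = 1 / k + (1 - 1 / k) * exp(-u * c)           # equation (3), as assumed
    factor = P(u * (1 - c)) if c < 1 else 1.0                     # replaced by unity if c >= 1
    acceptance = P(u) / k + (1 - 1 / k) * exp(-u * c) * factor    # equation (4), as assumed
    return acceptance / correspondence, 1 / correspondence


def P2(u):
    """Operating characteristic of the simple scheme for n = 2."""
    return (1 + u) * exp(-u)


oc, sampling_rate = scheme_ii_operating_characteristic(P2, u=0.8, k=3.0, c=0.5)
```

When k = 1 the rate of correspondence never changes, and the sketch returns the simple-scheme characteristic with a relative sampling rate of unity, as it should.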

F. J. A. 


Second Method 

An alternative method of finding the operating characteristic of the scheme of §5 will now be
described.

To determine the probability of acceptance of any lot we need only consider the (2D − 1)
lots of which it is the middle one and, in detail, only the positions of the (n − 1) defects nearest
to it on either side. The probability of acceptance with k of these lots defective is $p^k (1 - p)^{2D-1-k}$
multiplied by the number of arrangements of the defectives which permit of acceptance. This
number is

$^{2D-1}C_k$ for 0 ≤ k ≤ n − 1 . . . . (1)

k-n r - I 

r-*o —1 

-H i: 'v (*->• »-» cj «+'»Ct 

r ni - o 

r 2 

4- £ ’•-> C,„^, - *-»C, for «< /t < 2« - 3, . (2) 

m - -- 1 

A — 1 

where for odd A the second term is summed twice for A — /i + 1 < r ^ , and in the third 

A 4- 1 L- 

term we have r -= ^ while for even A, the second term is summed forA— /i + l <r<^ — 1 

^ 2 





and again for k — n f 1 < r < , and in the third term we have r 



Finally 


n~2 


I {2 "C„,+ i <— + (k-2n + 3) P+m q 


for 2n ~ 3 < A: 


. ( 3 ) 


In these expressions the binomial coefficient $^{a}C_{b}$ is defined for all positive, negative or zero integers
a, b, as the coefficient of $x^b$ in the expansion in ascending or descending powers of x of $(1 + x)^a$.

We now sum the probability expressions over the admissible values of k. If $p \to 0$, $pD \to u$,
then the limiting values of the probabilities are


where 

and 


fin-iC 

-M _ 

n 


It -3 . , 


' tl 


2n- « 


0{A, //)//*, 

k\ 




</.(A, //) = 2^ — for- 0 < A < /; — 1, 


V.(A, /I) - 1 + - (2/1 - I - A).*Cn 

f - M f I 

- ^ *C„-. , H ^ .‘'•-‘C, for /J < k < In - 6. 


The method of obtaining these results is indicated below : we first establish four lemmas dealing 
with the transformation and summation of expressions involving binomial coefficients. 


Lemma 1. 

i~b~- a 

The left-hand side is (coefficient of A"*-*^^** in (1 -} jc)«(l .v)^). 

Lemma 2. 

i ^ b — a 

The left-hand side is S (coefficient of X^ < c-i-d i) (i _ 

Putting b or d zero leads to 

Lemma 3.

i = 0 

Using lemma 1 write the left-hand side as 

V 2: kC, -Ci^J - S L 

i j i j 

“ ^ f-j ^Cj^fn ~ S ^Ci^j ^ 

3 J « 

= S (coefficient of (jc>'y+i “•*+»»» in (I -f yy"(l H- x -j- y f xyY), 





Lemma 4. 


Consider the sum 


« H l m-}-2-Pi 

S £ 


n -\- r - pt - 


* £ 


'/i+r+J-p, - ... - 


where 0 < i < i + 1 . (Here and elsewhere below the bracket notation for binomial coefficients 
is used for greater legibility.) 


Instead of summing over all the p's we make j of them 2 :ero and only sum over the remaining 
ones. The sum of the expressions so obtained by choosing the j in every possible way we 
denote by /(//, r, y, s, i) and show that it is equal to 




We may group the 'Cj expressions according to the number of zeros at the beginning of the sequence 
pi, Pa, ... . If there arc / - I of these, and we sum over pj-i-fi, then the sum of the 

expressions in this group is 


Hence 


it \ j - i\ i 

^ /(« \ j “ / f 1 ™ p. A* — / h / — 1, /, A’, /). 

P-^ I 

/(«, rj, s, /) -= ^ f(n ]- j - I + I - p, r — j + I ~ I, v, i). 

p - I 


Now /(//, y, /, .V, /) and the result follows for r - y f l,y -f 2, . . . by induction. 


The various arrangements of defectives permitting of acceptance may be grouped as follows : 
(a) k < //. 

(h) k > 2n - .3, r defects in lots 1,2, . . . £) — 1, r</j— 1, 

W „ » r > /? — 1, 

id) n ^ \ k In -- 3 „ r < /: -f- 1 — //, 

(e) „ r defects in lots 1,2,... D, r > k + ] — n^ 

if) M (/• -- 1) defects in lots 1, 2, . . . (O — 1), one defect in lot D, 

r k + 2- n. 


Assume 2r : k in (a) to (e), 2r ", k I 1 in (/). 

(a) All arrangements are acceptable and their number is $^{2D-1}C_k$.
ih) If /• ~ 0 the number of arrangements is 

If r 0, let the r defects in the first (/) — 1 ) lots be in lots Oi, Oa* . . . Of. Let there be Pi 
defects in lots r/,_i D to ai i- O — 1 inclusive. Then we must have 0 < pi + p- -f . . . f- 
Pi ’ // — 2 “I / “ r. Hence the number of arrangements is 


l>-k \ u - i 

V 

n 2~pi - . . . 

I 


I) -k + n /> — E f M -f r -- 2 ?/ — 1 ~ r 

£ . . . £ £ . 

+ l — Pi’’^ 


-Pr 


l\ /«2-a,\ / D-Or \ 

. Pi / V P» / • • • Pi PtJ 


The limits of summation for the p’s follow from the conditions above ; those for the a’s are the 
widest compatible with acceptance of the configuration. We first sum over the a’s,. using lemma 2. 
In order to avoid getting two terms at each summation we require alt the p’s except p, to have 




least value 1 . Hence we treat separately the zero values of the p's by means of lemma 4. If 7 
p’s are zero we have 

/(„ -> r - 1 p,, /• - 1 , 7 , 0 , 0 ) 

Pi=-0 

^ hC^_j r-lQ 

again using lemma 2. We now sum this over /* from 0 to (/• — 1 ). 

(c) Since only the arrangement of the (n − 1) defects nearest to the middle lot on either side
affects the acceptability of the configuration, we have for r ^ // — 1 -h / an extra term * 
in the summation. Also we replace r by // — 1 + / and have as limits of summation Pi = 0, 
0 < Pi “T Ps r . ... h Pi < / - 1 (2 < I < /I — 1), / - 1 - 1 : Qi < D — k 1 // - 1, etc. The 

value of the sum is ’2Q /(O, w — 2, 7 , 0, 0) - , This is 

n 

summed over 7 from 0 to n - 2 . 

(ci) When k < 2/2 — 3 we need not perform more than k — n I I summations, since if there 
are r defects in the first D lots, the positions of the n - /• — 1 after the Dih lot are immaterial and 
only the positions of the k -- n \ after that need be considered. Hence the number of sum- 
mations is the lesser of r, k ~ n 1-1. For r < A: — n }- 1 we proceed as in (b) and the same 
answer arises. 

(e) If r > A — /2 4 - 1 a further term appears, being the number of 

arrangements of the remaining r — (A — -f- 1) defects before lot (D -f 1). We transform 

ID~aic~n^i\ / D \ 

\r-A-f/ 2 -lM^ ■■'‘' Pi' -V 11 / 

r ! - N — /• - 1 

Y /r^n-k--2\ /k «nW £>--«/!. !»+ 1 !( 

^ \ / /\ /■ / \«-l -Pi . • . - P*. »H-i/ 

«=o 

by lemma 3. We then proceed as before, getting 

— A:-] k—n /? r--J 

2 r^n-k-iCi i: ^ + | - - p, A ~ W, 7, 1,/) 

i^O j -0 p 0 


N” 


i — (» 


k -n 

^ f + - 1 ( 7 - />> l-A -«-y+iQ 




(f) If we fix definitely that the Dth lot is defective and that there are (/• — 1) defects before it 
we arrive at the sum 


/>fw k-l 

S . . 


J) -\ w-r-l 

i: 2 : . 


k-r l-pi- . . . "Pk-n 

2 : 


tfi = l rt/t— « -f I — 1 Pi 0 


PA’-N + I (> 


fD 1 — ^ i\ ^ ^k-n^\ 

\ r+n-k-2 ) \ pi / . . . \k--r- p,- 14 , 

which gives 


r+n-lr-l 

s 

t -0 


S r + n-A-iQ 

y -0 


/t-nQ 


^-rC,-n-iV 


We now group these expressions according to the value of /w in ^ or ‘ ’'‘Q. -i, simpli- 
fying by summation in some cases. By subtracting case (/) from case (e) we obtain the number of 
arrangements in these groups when the Dth lot is non-defective. 

We now, for given A, add all the expressions for the various values of r. Since we have assumed 
in the proofs {a) to (e) that 2r < A, and in (f) that 2r < A + 1 , we must use the formulae only 




for this range ; from the symmetry of the conditions the number of arrangements for r defectives 
in lots 1 to — 1 is the same as for k — r. 

The results for the limiting case follow since 

Lt (1 — -A when m = o (/)). 

and Lt pHi - - 0. 

Further summation and simplification follow on using lemma 1 . 

H. J. G. 


Two Particular Schemes 

In order to show how an operating characteristic can be determined when the above theory 
does not apply, two particular schemes will be considered. The method of analysis of Scheme 2 
is due to Godwin. The computations have been done by Godwin, myself, and the computing 
section of the Mathematics Division, National Physical Laboratory. 

Regarding Scheme 1, the reader will note that it differs in three respects from those so far
discussed in Part II:

(i) the number of batches covered by a rejectable cluster of defectives is small, and
the fractions defective are not treated as though infinitesimal;

(ii) more than one item is tested per batch; and

(iii) the rejection rule, when it operates, causes the rejection of a fixed number of
batches forward from the batch corresponding to the first member of the rejectable cluster
of defectives, instead of only from the first to the last member of the cluster (a possible
feature of the rejection rule that was mentioned in §12 of Part I).

It would have been interesting to have tested the effect of each of these modifications separately.
Unfortunately it has not been possible to carry through a suitable programme of computations.
J. P. Burman has worked on schemes in which modifications (i) and (ii) but not (iii) are made, and
has used a method of enumerating acceptable defect-combinations (instead of rejectable
combinations). His first mode of classifying the combinations is by the number of consecutive batches,
including the one under consideration, that showed no defectives in the test.

Scheme 1 

Test a sample of 40 from each batch, and set out the results as in the table below.

Batch number                               101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118

Number of defectives in the sample           0   0   1   0   0   1   0   2   0   1   1   1   1   0   0   1   0   0

Number of defectives in previous five
samples, including the present one           2   1   2   1   1   2   2   3   3   4   4   5   4   4   3   3   2   1

As long as less than 5 defectives appear in every set of five samples, as shown by the number 
in the bottom row of the table being less than 5, accept the batch four in arrear of the last batch 
sampled, i.e. after sampling batch 105, batch 101 may be accepted. If the number in the bottom 
row at any time is 5 or more, reject all batches back to the one in which the first of the five or 
more defectives was found, accepting any others that may remain unaccounted for. After reject- 
ing a group of batches, continue sampling as before, rejecting if a number in the bottom row of 5 
or more again appears, and otherwise suspending judgment until five further batches have been 
sampled. Thus in the example given, when batch 111 has been sampled, batch 107 may be 





accepted. But when batch 112 has been sampled, batches 108, 109, 110, 111 and 112 must be 
rejected. Numbers of 5 or more do not appear again in the bottom row, so that as batches 113, 
114, 115 and 116 are sampled, judgment is suspended. When 117 has been sampled, 113 may be
accepted, and so on. 
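
The sentencing rule of Scheme 1 can be written out as a short routine. The Python sketch below is an illustration, not the authors' procedure: the function name `sentence_batches` is hypothetical, the defect counts for batches 101 to 118 are those of the worked example as read here, and the four counts for batches 97 to 100 are not given in the text but are inferred from the running totals (an assumption).

```python
def sentence_batches(defectives, first_batch, window=5, limit=5):
    """Return the sorted list of rejected batch numbers.

    defectives: defect counts of consecutive samples, starting at first_batch.
    When the total of the last `window` samples reaches `limit`, all batches
    back to the one showing the first of those defectives are rejected.
    """
    rejected = set()
    for j in range(window - 1, len(defectives)):
        block = defectives[j - window + 1 : j + 1]
        if sum(block) >= limit:
            # position within the window of the first batch showing a defective
            start = next(m for m, d in enumerate(block) if d > 0)
            for m in range(j - window + 1 + start, j + 1):
                rejected.add(first_batch + m)
    return sorted(rejected)


# Batches 97 to 118; the first four counts are inferred, not given in the text.
counts = [1, 0, 1, 0,
          0, 0, 1, 0, 0, 1, 0, 2, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0]
print(sentence_batches(counts, first_batch=97))  # [108, 109, 110, 111, 112]
```

Every batch not returned by the routine is eventually accepted, four batches in arrear of the sampling. The same routine with `limit=4` covers the variant in which 4 or more defectives in five consecutive samples cause rejection.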

To analyse this scheme, suppose the number of defectives in the ith batch is $n_i$, so that

$$t_i = n_i + n_{i-1} + n_{i-2} + n_{i-3} + n_{i-4}$$

is the number of defectives in the five batches preceding and including the ith. Then the ith batch
is rejected if and only if

(1) $t_i \ge 5$, or

(2) $t_i < 5$ and one of the following holds:

(a) $t_{i+1} \ge 5$; $n_i + n_{i-1} + n_{i-2} + n_{i-3} > 0$

(b) $t_{i+1} < 5$, $t_{i+2} \ge 5$; $n_i + n_{i-1} + n_{i-2} > 0$

(c) $t_{i+1} < 5$, $t_{i+2} < 5$, $t_{i+3} \ge 5$; $n_i + n_{i-1} > 0$

(d) $t_{i+1} < 5$, $t_{i+2} < 5$, $t_{i+3} < 5$, $t_{i+4} \ge 5$; $n_i > 0$

and is otherwise accepted.

We now enumerate the various possibilities specified in (2), find the probability of each, and
of (1), and sum. A systematic method of doing this is to consider the group of batches which
include the ith and which together account for 1, 2, 3 or 4 defectives; the possibilities for the
remaining batches, considered singly, can be detailed in sections corresponding to these numbers.
The layout is best seen by enumerating the cases under 2 (b).


Defectives in batch number

(i − 4)      (i − 3)      (i − 2), (i − 1) and i      (i + 1)      (i + 2)
(at most)                 (together)                               (at least)

   3            0                   1                    3            1
   3            0                   1                    2            2
   3            0                   1                    1            3
   3            0                   1                    0            4
   2            1                   1                    2            2
   2            1                   1                    1            3
   2            1                   1                    0            4
   1            2                   1                    1            3
   1            2                   1                    0            4
   0            3                   1                    0            4
   2            0                   2                    2            1
   2            0                   2                    1            2
   2            0                   2                    0            3
   1            1                   2                    1            2
   1            1                   2                    0            3
   0            2                   2                    0            3
   1            0                   3                    1            1
   1            0                   3                    0            2
   0            1                   3                    0            2
   0            0                   4                    0            1


The table below indicates the operating characteristic of this scheme, and also that of a similar
scheme in which rejection occurs when 4 or more defectives, instead of 5 or more, occur in 5
consecutive samples.

Proportion        Probability of acceptance        Probability of acceptance
defective.        (criterion of 5 rejects).        (criterion of 4 rejects).

   ½%                    99.00%                           95.51%
   1%                    87.85%                           72.35%
   4%                     1.82%                            0.65%
   5%                     0.28%                            0.09%





Scheme 2 

This is illustrated in the table below.

Batch number                               101 102 103 104 105 106 107 108 109 110 111 112 113 114 115

Number of defectives in sample of 20         0   0   1   0   0   0   1   0   1   0   1   0   0   1   0

Number of defectives in sample of 30         -   -   1   0   0   -   0   1   1   0   0   0   0   0   0

Number of defectives in previous five
samples, including the present one           0   0   2   2   2   2   3   2   4   4   5   4   3   2   2


Sample 20 from each batch until such a time as one or more defectives are found in the sample;
then test 30 more of that batch and of all succeeding batches until two consecutive samples of 50
yield no defectives, after which continue inspecting samples of 20. As long as less than 4 defectives
appear in every set of five samples, as shown by the number in the bottom row of figures being
less than 4, accept the batch four in arrear of the last batch sampled; after sampling batch 105,
batch 101 may be accepted. If the number in the bottom row at any time reaches 4 or more,
reject all batches back to the one in which the first of the 4 or more defectives was found, and
accept any batches that may remain unaccounted for. After rejecting a group of batches continue
sampling as before, rejecting if a number in the bottom row of 4 or more again appears, and
otherwise suspending judgment until five further batches have been sampled. Thus, in the example
given, when batch 108 has been sampled, batch 104 may be accepted. But when batch 109 has
been sampled, batches 107, 108 and 109 are rejected, while batches 105 and 106 are accepted
because they remain unaccounted for. Batches 110, 111 and 112 are rejected as they are sampled,
and onwards from batch 113 judgment is suspended for five batches if the number in the bottom
row does not again rise to 4 or more.

The conditions under which a batch is rejected are specified as in the analysis of Scheme 1,
with the alteration that the number of rejects 5 is everywhere replaced by 4. It remains as before
to find the probabilities of the various cases and it is here that the difficulty arises, namely, that
the sample size varies from batch to batch. The first problem is therefore to determine the
probabilities of various sample size configurations. Before sampling from any batch, one of the
following conditions must hold:

(a) We shall take 20 from the batch and only take more if there is a defective in the 20.

(b) We shall take 50 from the batch and, if there are no defectives in the 50, take 20 from the
following batch.

(c) We shall take 50 from the batch, and also from the next batch.

Let the probabilities of these conditions holding before the ith batch is sampled be $p_{1,i}$, $p_{2,i}$,
$p_{3,i}$ respectively. Let the probability of 0 defectives in a sample of 20 be h, and k the probability
of 0 defectives in a sample of 50.

Note now that condition (a) holds when the previous sample was 20 and contained no
defectives; or when the two previous samples were 50 and contained no defectives, and the one before
that contained defectives.

Hence $p_{1,i} = h\, p_{1,i-1} + k^2 [p_{1,i-3}(1 - h) + (1 - p_{1,i-3})(1 - k)]$ . . . . (1)

Condition (b) holds if the previous sample was a 50 with no defectives, and the one before that
contained defectives.

Hence $p_{2,i} = k [p_{1,i-2}(1 - h) + (1 - p_{1,i-2})(1 - k)]$ . . . . (2)

Condition (c) holds if the previous sample contained defectives.

Hence $p_{3,i} = p_{1,i-1}(1 - h) + (1 - p_{1,i-1})(1 - k)$ . . . . (3)

Consider first equation (1). In this put $p_{1,i} = u_i + m$.

Then $u_i - h u_{i-1} + k^2 (h - k) u_{i-3} + m[(1 - h) + k^2 (h - k)] = k^2 (1 - k)$.

Put $m = k^2 / [k^2 + (1 + k)(1 - h)]$, then

$u_i - h u_{i-1} + k^2 (h - k) u_{i-3} = 0$.





But since the roots of the equation $x^3 - h x^2 + k^2 (h - k) = 0$ are all less than 1 in modulus
(since $1 > h > k > 0$) it follows that $u_i \to 0$ as $i \to \infty$.

Hence, as $i \to \infty$,

$p_{1,i} \to p_1 = k^2 / [k^2 + (1 + k)(1 - h)]$

$p_{2,i} \to p_2 = k(1 - h) / [k^2 + (1 + k)(1 - h)]$

$p_{3,i} \to p_3 = (1 - h) / [k^2 + (1 + k)(1 - h)]$
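
The limiting probabilities may be checked by iterating the recurrence for condition (a). The Python sketch below is an illustration under stated assumptions: h and k are computed as $(1 - p)^{20}$ and $(1 - p)^{50}$, which supposes that each item is independently defective with probability p; the recurrence is taken to be $p_{1,i} = h\,p_{1,i-1} + k^2[p_{1,i-3}(1 - h) + (1 - p_{1,i-3})(1 - k)]$; the starting values are arbitrary; and the function names are hypothetical.

```python
def limiting_condition_probabilities(p, small=20, large=50):
    """Limiting probabilities (p1, p2, p3) of conditions (a), (b), (c)."""
    h = (1 - p) ** small   # chance of no defective in a sample of 20 (assumed binomial)
    k = (1 - p) ** large   # chance of no defective in a sample of 50 (assumed binomial)
    denominator = k ** 2 + (1 + k) * (1 - h)
    return k ** 2 / denominator, k * (1 - h) / denominator, (1 - h) / denominator


def iterate_condition_a(p, steps=200, small=20, large=50):
    """Iterate the recurrence for p_{1,i}, starting from condition (a) holding."""
    h = (1 - p) ** small
    k = (1 - p) ** large
    p1 = [1.0, 1.0, 1.0]   # arbitrary starting values for the first three batches
    for _ in range(steps):
        previous, third_back = p1[-1], p1[-3]
        p1.append(h * previous + k ** 2 * (third_back * (1 - h) + (1 - third_back) * (1 - k)))
    return p1[-1]


p1, p2, p3 = limiting_condition_probabilities(0.01)
```

The three limiting values sum to unity, and the iterated value of $p_{1,i}$ settles on $p_1$ whatever the starting values, since the $u_i$ die away geometrically.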

All the configurations of defect numbers necessary for acceptance in the nine consecutive
batches of which the given one is the middle one are now enumerated. By multiplying the
probability of accepting after a given initial state by the probability of such a state, and summing,
the required probability of acceptance is found.

The following results have been obtained:

Proportion defective.        Probability of acceptance.

       ½%                          95.25%
       1%                          70.63%
       4%                           0.68%
       5%                           0.12%

R. L. P. 



218 


LIND1.EY — Regression Lines and the • 


(No, 2, 


Regression Lines and the Linear Functional Relationship 
By D. V. Lindley, B.A. 

(Communication from the National Physical Laboratory) 

Abstract 

The first half of this paper solves the following problem: if there is a linear multiple regression 
between n variates, under what conditions will the regression continue to be linear when the 
variates are influenced by error? The new regression coefficients are obtained in terms of the 
original coefficients. 

This leads in the second half to a discussion of the use of regression lines and functional 
relationships in statistical methodology. In the case of normal distributions the problem of
estimation is discussed at some length. Since the classical work has used least squares methods 
a section relating to this work is included and some criticisms offered. 


1. The Effect of Errors on Regression Lines

1.1. A random variable $\xi_{n+1}$, called the dependent variable, has a linear regression on a
number of other random variables $\xi_1, \xi_2, \ldots, \xi_n$, the independent variables, when, for any fixed
set of values of the independent variables, the dependent variable is distributed about a mean
which is a linear function of the values of the independent variables. This I write as

$$E(\xi_{n+1} \mid \xi_1, \ldots, \xi_n) = \sum_{i=1}^{n} \alpha_i \xi_i + \beta$$

where $\alpha_i$ and $\beta$ are constants. By suitable choice of the origin of coordinates $\beta$ can be made
to vanish and I am left with

$$E(\xi_{n+1} \mid \xi_1, \ldots, \xi_n) = \sum_{i=1}^{n} \alpha_i \xi_i \qquad (1)$$

It is well known that the presence of errors in the measurement of $\xi_{n+1}$, provided they are
unbiased, will not affect the determination of the regression line in so far as the estimates of the
coefficients will continue to be valid; only their standard errors will be increased. However, if
similar errors are made in the measurement of $\xi_i$ (i = 1, 2, . . . n), unbiased or not, the estimates
will cease to be valid and we shall not get a true picture of the regression of $\xi_{n+1}$ on the
remaining $\xi_i$'s. The effect of errors in the dependent variable can be removed by increasing the size
of the sample: the effect of errors in the independent variable can only be removed by using a
different method of approach. It is the investigation of this latter effect which occupies the
greater part of this paper.

1.2. I formulate the state of affairs as follows:

$\xi_1, \xi_2, \ldots \xi_{n+1}$ are $n + 1$ random variables with the joint probability density function $\pi(\xi_1, \xi_2, \ldots \xi_{n+1})$ such that (1) is true.

$\Delta\xi_1, \Delta\xi_2, \ldots \Delta\xi_{n+1}$ are $n + 1$ independent random variables with probability density functions $g_i(\Delta\xi_i)$ with zero means: furthermore they are independent of the $\xi_i$.

I consider the random variables

$$x_i = \xi_i + \Delta\xi_i \qquad (i = 1, 2, \ldots n + 1) \tag{2}$$

and denote their joint probability density function by $p(x_1, x_2, \ldots x_{n+1})$.

Two questions immediately arise: is the regression of $x_{n+1}$ on the remaining $x_i$'s linear, i.e. will

$$\bar{x}_{n+1} = \sum_{i=1}^{n} a_i x_i + b \tag{3}$$

and if this relationship holds what will be the value of the coefficients $a_i$ and $b$?




1.3. Assuming that (3) does hold it is not difficult to find these coefficients. I suppose for simplicity that $n$ is equal to 1 so that (1) becomes

$$\bar{\xi}_2 = \alpha \xi_1$$

This may be written

$$\int \xi_2\, \pi(\xi_1, \xi_2)\, d\xi_2 = \alpha \xi_1 \int \pi(\xi_1, \xi_2)\, d\xi_2$$

and on multiplication by $\xi_1$ and integration over $\xi_1$ this gives

$$\mu_{11} = \alpha \mu_{20} \tag{4}$$

where $\mu_{ij}$ are the moments about the mean of the joint distribution of $\xi_1, \xi_2$ (there is obviously no loss in generality in assuming the means zero, a point I shall return to later, § 1.5).

If $\mu'_{ij}$ are similarly the moments of $x_1, x_2$ and $\mu_i$ the moments of $\Delta\xi_1$ I have

$$\mu'_{20} = \mu_{20} + \mu_2, \qquad \mu'_{11} = \mu_{11} \tag{5}$$

by the independence of $\xi_i$ and $\Delta\xi_i$.

Accordingly if $\bar{x}_2 = a x_1$, just as I obtained (4) I obtain

$$\mu'_{11} = a \mu'_{20} \tag{6}$$

Substituting (5) in (6)

$$\mu_{11} = a(\mu_{20} + \mu_2)$$

and eliminating $\mu_{11}$ between this and (4) I have

$$\alpha \mu_{20} = a(\mu_{20} + \mu_2) \tag{7}$$

This, then, is the connection between $\alpha$ and $a$ and clearly they are only equal if $\mu_2 = 0$, that is when no errors are present. In general if $\alpha > 0$, $\alpha > a > 0$ so that the slope of the regression line tends to be decreased.
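Equation (7) can be tried out numerically. The following is a minimal sketch, not part of the original paper: the function name `observed_slope` and the sample values are assumptions chosen only for illustration. It solves (7) for the observed slope given the true slope, the variance of the true independent variable, and the variance of the error in that variable.

```python
def observed_slope(alpha, var_true, var_error):
    """Solve equation (7), alpha * mu20 = a * (mu20 + mu2), for the observed slope a.

    alpha     -- slope of the true regression line
    var_true  -- mu20, variance of the true independent variable
    var_error -- mu2, variance of the error in the independent variable
    """
    if var_true <= 0 or var_error < 0:
        raise ValueError("var_true must be positive and var_error non-negative")
    return alpha * var_true / (var_true + var_error)


# Illustrative values only: the slope shrinks towards zero as the error variance grows.
for var_error in (0.0, 0.5, 1.0, 4.0):
    print(var_error, observed_slope(2.0, 1.0, var_error))
```

With no error the observed slope equals the true slope; with an error variance equal to the variance of the true variable the slope is halved.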

However, in deriving (7) I have assumed that the regression of $x_2$ on $x_1$ is really linear. This assumption is not trivial, in fact it is not immediately obvious that it would ever be possible for a given distribution of $\xi_i$'s to arrange suitable distributions for the $\Delta\xi_i$'s so that the assumption held. To see what sort of conditions would have to be imposed in order that $\bar{x}_2 = a x_1$ I generalize equation (4). Just as (4) was obtained by multiplying the previous equation by $\xi_1$ and integrating over $\xi_1$; by multiplying by $\xi_1^r$ $(r > 1)$ and integrating over $\xi_1$ I obtain

$$\iint \xi_1^r \xi_2\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 = \alpha \iint \xi_1^{r+1}\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2$$

i.e.

$$\mu_{r,1} = \alpha \mu_{r+1,0} \tag{8}$$


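This may be put in a more convenient form by the use of cumulants rather than moments. The bivariate cumulants are defined by

$$\exp\left( \sum_{i,j} \kappa_{ij} \frac{t_1^i t_2^j}{i!\,j!} \right) = \sum_{i,j} \mu_{ij} \frac{t_1^i t_2^j}{i!\,j!}$$

differentiating with respect to $t_2$ and putting $t_2 = 0$,

$$\exp\left( \sum_i \kappa_{i0} \frac{t_1^i}{i!} \right) \sum_i \kappa_{i1} \frac{t_1^i}{i!} = \sum_i \mu_{i1} \frac{t_1^i}{i!} \tag{9}$$

and differentiating with respect to $t_1$ and again putting $t_2 = 0$,

$$\exp\left( \sum_i \kappa_{i0} \frac{t_1^i}{i!} \right) \sum_i \kappa_{i+1,0} \frac{t_1^i}{i!} = \sum_i \mu_{i+1,0} \frac{t_1^i}{i!} \tag{10}$$

Now from (8) it is clear that the right-hand side of (9) is $\alpha$ times the right-hand side of (10), so that from the left-hand sides of (9) and (10) I have

$$\sum_r \kappa_{r,1} \frac{t_1^r}{r!} = \alpha \sum_r \kappa_{r+1,0} \frac{t_1^r}{r!}$$

giving

$$\kappa_{r,1} = \alpha \kappa_{r+1,0} \qquad (r > 1) \tag{11}$$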





These conditions are thus necessary for the regression of $\xi_2$ on $\xi_1$ to be linear. The generalization of (5) is (see Theorem II below)

$$\kappa'_{r+1,0} = \kappa_{r+1,0} + \kappa_{r+1}, \qquad \kappa'_{r,1} = \kappa_{r,1} \tag{12}$$

in an obvious notation: (11) applied to the regression of $x_2$ on $x_1$ gives

$$\kappa'_{r,1} = a \kappa'_{r+1,0} \tag{13}$$

as a necessary condition for $\bar{x}_2 = a x_1$. Combining (11) and (13) with (12)

$$\kappa_{r,1} = a(\kappa_{r+1,0} + \kappa_{r+1}) = \alpha \kappa_{r+1,0}$$

whence

$$a \kappa_{r+1} = (\alpha - a) \kappa_{r+1,0} \tag{14}$$

(14) thus expresses a necessary condition for the regression of $x_2$ on $x_1$ to be linear whenever the regression of $\xi_2$ on $\xi_1$ is linear. Under certain conditions a reversal of the argument will show it to be sufficient. One point that might be noticed is that (14) does not involve the cumulants of $\Delta\xi_2$, so that the opening remark that unbiased errors in the dependent variable will not affect the regression is verified.

1.4. The above method of analysis has required the existence of all the cumulants of all orders, and the proof of sufficiency mentioned will require a distribution to be uniquely determined by its moments. Such requirements do not give very great generality, and it is natural to search for a more widely applicable analysis: this is provided by utilizing the ideas of characteristic functions (c.f.) and cumulative functions (cm.f.). I use the notation

$$\phi_\xi(t_1, t_2, \ldots t_n) = \int \!\cdots\! \int \exp\left( i \sum_{j=1}^{n} t_j \xi_j \right) \pi(\xi_1, \xi_2, \ldots \xi_n)\, d\xi_1 d\xi_2 \ldots d\xi_n \tag{15}$$

for the characteristic function of $\xi_1, \xi_2, \ldots \xi_n$ where $\pi(\xi_1, \xi_2, \ldots \xi_n)$ is the marginal distribution of $\xi_1, \xi_2, \ldots \xi_n$, i.e.

$$\pi(\xi_1, \xi_2, \ldots \xi_n) = \int \pi(\xi_1, \xi_2, \ldots \xi_{n+1})\, d\xi_{n+1}$$

The cumulative function is then defined by

$$\exp \psi_\xi(t_1, t_2, \ldots t_n) = \phi_\xi(t_1, t_2, \ldots t_n).$$

$\phi_x$, $\psi_x$ are defined similarly for $p(x_1, x_2, \ldots x_n)$.

Using these ideas a necessary and sufficient condition for the regression of $x_{n+1}$ to be linear whenever that of $\xi_{n+1}$ is linear is obtained. The proof only requires the functions to be suitably respectable so that operations of reversal of order of integration and differentiation under the sign of integration are valid. Such requirements are more easily satisfied in statistics than those needing the existence of cumulants.

1.5. One point might be mentioned before giving the proof. Just as the regression of $\xi_{n+1}$ was written in the homogeneous form (1), so the regression of $x_{n+1}$, (3), can be written

$$\bar{x}_{n+1} = \sum_{i=1}^{n} a_i x_i \tag{16}$$

without any loss of generality. For suppose

$$\int \!\cdots\! \int \xi_i\, \pi(\xi_1, \xi_2, \ldots \xi_{n+1})\, d\xi_1 \ldots d\xi_{n+1} = 0 \qquad (i = 1, 2, \ldots n + 1),$$

that is the grand mean of $\xi_i$ vanishes: this can be secured by change of coordinates. Then $\beta = 0$ and I have (1). But from (2) it follows that the grand mean of $x_i$ will also vanish so that $b = 0$ and I have (16). Accordingly I suppose both the regressions are in the homogeneous form.


2. Solution of the General Problem of the Continued Existence of the Regression Line 

2.1. For simplicity in notation I give the proof for $n = 2$; the reader can easily convince himself that no essentially new analysis is involved in the more general case.




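Suppose then I have the state of affairs given in § 1.2 above. It immediately follows that

$$p(x_1, x_2, x_3) = \iiint g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, g_3(x_3 - \xi_3)\, \pi(\xi_1, \xi_2, \xi_3)\, d\xi_1 d\xi_2 d\xi_3$$

whence

$$\int x_3\, p(x_1, x_2, x_3)\, dx_3 = \iiint \xi_3\, g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2, \xi_3)\, d\xi_1 d\xi_2 d\xi_3$$

$$= \iint (\alpha_1 \xi_1 + \alpha_2 \xi_2)\, g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2$$

Similarly

$$\int p(x_1, x_2, x_3)\, dx_3 = \iint g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2$$

Now if the regression of $x_3$ is linear

$$\frac{\int x_3\, p(x_1, x_2, x_3)\, dx_3}{\int p(x_1, x_2, x_3)\, dx_3} = a_1 x_1 + a_2 x_2$$

and so I have as the condition

$$\iint (\alpha_1 \xi_1 + \alpha_2 \xi_2)\, g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 = \iint (a_1 x_1 + a_2 x_2)\, g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 \tag{17}$$

To put this in a more convenient form I multiply by $\exp(i t_1 x_1 + i t_2 x_2)$ and integrate with respect to $x_1$ and $x_2$. I denote by $G_j(t_j, \xi_j)$ the c.f. of $x_j$ for a fixed $\xi_j$ $(j = 1, 2)$ so that

$$G_j(t_j, \xi_j) = \int \exp(i t_j x_j)\, g_j(x_j - \xi_j)\, dx_j$$

and hence

$$\frac{\partial}{\partial t_j} G_j(t_j, \xi_j) = i \int \exp(i t_j x_j)\, x_j\, g_j(x_j - \xi_j)\, dx_j \qquad (j = 1, 2)$$

whence (17) becomes

$$\iint (\alpha_1 \xi_1 + \alpha_2 \xi_2)\, G_1(t_1, \xi_1)\, G_2(t_2, \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 = \frac{a_1}{i} \iint \frac{\partial G_1}{\partial t_1}\, G_2(t_2, \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 + \frac{a_2}{i} \iint G_1(t_1, \xi_1)\, \frac{\partial G_2}{\partial t_2}\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 \tag{18}$$

By taking expectations of both sides of $\exp(i t_j x_j) = \exp(i t_j \xi_j + i t_j \Delta\xi_j)$, which is obtained from (2), it follows that

$$G_j(t_j, \xi_j) = \exp\left( i t_j \xi_j + K_j(t_j) \right) \qquad (j = 1, 2) \tag{19}$$

and hence

$$\frac{\partial}{\partial t_j} G_j(t_j, \xi_j) = \exp\left( i t_j \xi_j + K_j(t_j) \right) \left( i \xi_j + \frac{dK_j}{dt_j} \right)$$

where $K_j(t_j)$ is the cm.f. of $\Delta\xi_j$. Using these in (18) and dividing throughout by $\exp(K_1(t_1) + K_2(t_2))$ I have

$$\iint (\alpha_1 \xi_1 + \alpha_2 \xi_2)\, e^{i t_1 \xi_1 + i t_2 \xi_2}\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 = \sum_{j=1}^{2} \frac{a_j}{i} \iint e^{i t_1 \xi_1 + i t_2 \xi_2} \left( i \xi_j + \frac{dK_j}{dt_j} \right) \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2$$

Now recalling the definition of the c.f. of $\xi_1, \xi_2$, (15), this can be simply written

$$(\alpha_1 - a_1) \frac{\partial \phi_\xi}{\partial t_1} + (\alpha_2 - a_2) \frac{\partial \phi_\xi}{\partial t_2} = \phi_\xi \left( a_1 \frac{dK_1}{dt_1} + a_2 \frac{dK_2}{dt_2} \right)$$

or more conveniently in terms of the cm.f.

$$(\alpha_1 - a_1) \frac{\partial \psi_\xi}{\partial t_1} + (\alpha_2 - a_2) \frac{\partial \psi_\xi}{\partial t_2} = a_1 \frac{dK_1}{dt_1} + a_2 \frac{dK_2}{dt_2}$$

Obviously the extension to $n + 1$ variables is

$$\sum_{i=1}^{n} (\alpha_i - a_i) \frac{\partial \psi_\xi}{\partial t_i} = \sum_{i=1}^{n} a_i \frac{dK_i}{dt_i} \tag{20}$$

I have thus established that (20) is the necessary condition for the regression (16) to hold whenever (1) holds. A reversal of the argument will show that it is sufficient.

I thus have the following:

Theorem I. Under the conditions of § 1.2, the necessary and sufficient condition that the regression of $x_{n+1}$ on the remaining $x_i$'s be linear is given by (20).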

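2.2. In order to put (20) in a more symmetrical form it is useful to prove a theorem that has been used before (12) and a particular case of which has been used in Theorem I. It is well known in the univariate case, but it is perhaps worth giving here in the general case in terms of cumulative functions. I have, recalling the definitions in § 1.4,

Theorem II.

$$\psi_x = \psi_\xi + \sum_{i=1}^{n} K_i(t_i) \tag{21}$$

For

$$\phi_x = \int \!\cdots\! \int \exp\left[ i \sum_{j=1}^{n} t_j x_j \right] p(x_1, x_2, \ldots x_n) \prod_{j=1}^{n} dx_j$$

$$= \int \!\cdots\! \int \exp\left[ i \sum_{j=1}^{n} t_j x_j \right] \int \!\cdots\! \int \prod_{j=1}^{n} g_j(x_j - \xi_j)\, \pi(\xi_1, \ldots \xi_n) \prod_{j=1}^{n} dx_j\, d\xi_j$$

$$= \int \!\cdots\! \int \prod_{j=1}^{n} G_j(t_j, \xi_j)\, \pi(\xi_1, \xi_2, \ldots \xi_n) \prod_{j=1}^{n} d\xi_j, \quad \text{so from (19)}$$

$$= \exp\left( \sum_{j=1}^{n} K_j \right) \int \!\cdots\! \int \exp\left[ i \sum_{j=1}^{n} t_j \xi_j \right] \pi(\xi_1, \ldots \xi_n) \prod_{j=1}^{n} d\xi_j$$

$$= \exp\left( \sum_{j=1}^{n} K_j \right) \phi_\xi.$$

Taking logarithms the theorem is proved; (12) is now obvious.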
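The additive property in Theorem II can be checked directly for a single variable. The sketch below is an illustration only and not the author's method: it uses two small discrete distributions (an assumption made so that the characteristic functions are finite sums), forms the distribution of their independent sum by convolution, and compares the cumulative function of the sum with the sum of the two cumulative functions. The names `cf`, `cmf` and `convolve` are hypothetical helpers.

```python
import cmath


def cf(pmf, t):
    """Characteristic function of a discrete distribution given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * v) for v, p in pmf.items())


def cmf(pmf, t):
    """Cumulative function: the logarithm of the characteristic function."""
    return cmath.log(cf(pmf, t))


def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete variables."""
    out = {}
    for va, pa in pmf_a.items():
        for vb, pb in pmf_b.items():
            out[va + vb] = out.get(va + vb, 0.0) + pa * pb
    return out


# Assumed example distributions, both with zero mean.
xi = {-1: 0.25, 0: 0.5, 1: 0.25}    # true value
err = {-1: 0.3, 0: 0.4, 1: 0.3}     # error, independent of the true value
x = convolve(xi, err)               # observed value x = xi + error

for t in (0.3, 0.7, 1.1):
    gap = abs(cmf(x, t) - (cmf(xi, t) + cmf(err, t)))
    print(t, gap)
```

The values of `t` are kept small enough that both characteristic functions stay positive, so the principal logarithm is the correct branch.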
Eliminating $K_i$ between (20) and (21) differentiated I have

$$\sum_{i=1}^{n} \alpha_i \frac{\partial \psi_\xi}{\partial t_i} = \sum_{i=1}^{n} a_i \frac{\partial \psi_x}{\partial t_i} \tag{22}$$

which shows that the left-hand side of the equation is an invariant. This form might have been obtained directly rather than by obtaining (20) first, but I preferred to obtain a form indicating the nature of the distribution of error, which (20) does, rather than a relation between the $x_i$ and $\xi_i$ distributions.

Using (21) the condition may be written

$$\sum_{i=1}^{n} (\alpha_i - a_i) \frac{\partial \psi_x}{\partial t_i} = \sum_{i=1}^{n} \alpha_i \frac{dK_i}{dt_i} \tag{23}$$

2.3. The most obvious feature of (20) is that it is a relation between the independent variables only. Accordingly $\pi(\xi_1, \xi_2, \ldots \xi_n)$ is quite arbitrary apart from having to satisfy (20), for, it having been chosen, $\xi_{n+1}$ can then be selected as any variable distributed anyhow provided only that (1) holds. Thus in discussing solutions of (20) we need not take account of the regression restriction.





Second, if it was known that the regression of $x_{n+1}$ was linear, then it follows that it would be a necessary condition for the linearity of regression of $\xi_{n+1}$ that the relationship between the cumulative functions should hold. Under fairly general conditions this can be shown to be sufficient. The mean value of $x_3$ (I take $n = 2$, again) is as before

$$\bar{x}_3 = a_1 x_1 + a_2 x_2 = \frac{\iint \bar{\xi}_3\, g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2}{\iint g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2} \tag{24}$$

If (20), $n = 2$, holds so does (17), and from this and (24) I have

$$\iint \left\{ \bar{\xi}_3 - (\alpha_1 \xi_1 + \alpha_2 \xi_2) \right\} g_1(x_1 - \xi_1)\, g_2(x_2 - \xi_2)\, \pi(\xi_1, \xi_2)\, d\xi_1 d\xi_2 = 0$$

for all $x_1, x_2$: under fairly general conditions this implies

$$\bar{\xi}_3 = \alpha_1 \xi_1 + \alpha_2 \xi_2$$

as required. Thus whenever one of the regressions exists and (20) holds the other regression will also exist. Furthermore this reciprocal relationship shows that an analogous theorem can be stated when one of the regressions exists and the errors are independent of $x_i$ and not of $\xi_i$ as here.


3. The Case of a Single Independent Variable 

3.1. Before discussing the general case it will be helpful to consider the simple regression on a single random variable which has already been mentioned in § 1.3. (1) becomes

$$\bar{\xi}_2 = \alpha \xi_1$$

and (20) becomes

$$(\alpha - a) \frac{d\psi_\xi}{dt} = a \frac{dK}{dt} \tag{25}$$

where $\psi_\xi$ is the cm.f. of $\xi_1$ only, and $K$ the cm.f. of $\Delta\xi_1$ only. This can be integrated immediately giving

$$(\alpha - a) \psi_\xi = a K \tag{26}$$

the constant of integration being zero from the general properties of cumulative functions. Thus I have

Corollary I. In the case of a single independent variable the necessary and sufficient condition that the regression continue to be linear when the independent variable is subject to error is that the cm.f. of the independent variable is a multiple of the cm.f. of the error.

Just as (20) was transformed to give (22) and (23), (26) may be put in either of the forms

$$\alpha \psi_\xi = a \psi_x \tag{27}$$

$$(\alpha - a) \psi_x = \alpha K \tag{28}$$

Equation (7) may be obtained by using the relationship got from (27), provided the differentials exist,

$$\alpha \frac{d^2 \psi_\xi}{dt^2} = a \frac{d^2 \psi_x}{dt^2}:$$

with $t = 0$, I have

$$\alpha \mu_{20} = a(\mu_{20} + \mu_2)$$

as required (remembering $\kappa_{20} = \mu_{20}$).

Furthermore it is apparent that (26) is merely another way of writing (14), but the converse is not true since the cumulative function always exists, though the cumulants themselves may not since the function may not be expandable in a power series. An example is provided by the Cauchy distribution where $\psi(t) = -|t|$. Thus (26) is more general than (14).




3.2. Suppose that the distribution of $\xi_1$ is normal (this does not necessarily imply that $\pi(\xi_1, \xi_2)$ is a normal bivariate distribution), then

$$\psi_\xi = -\tfrac{1}{2} \sigma^2 t^2$$

where $\sigma^2$ is the variance of $\xi_1$. Hence

$$K = -\tfrac{1}{2} \left( \frac{\alpha}{a} - 1 \right) \sigma^2 t^2$$

and is thus the cm.f. of a normal distribution so that $\Delta\xi_1$ is normal, with variance $(\alpha/a - 1)\sigma^2$. Thus for normal distributions only a normal distribution of error will preserve the linearity of regression. This result has been obtained by Seares using a result of Eddington's.

Instead of enquiring what is the value of $a$ corresponding to a given error one might equally enquire what sort of error would have to be made to get a given value of $a$. In the normal case it is clear that all values of $(\alpha/a - 1)$ which are positive (and no others) can be obtained by suitable choice of the error variance. That is (if $\alpha > 0$), any $a$, $\alpha > a > 0$, can be found by taking $\mathrm{var}(\Delta\xi_1) = (\alpha/a - 1)\sigma^2$. But suppose $\psi_\xi$ is a general distribution then $K = (\alpha/a - 1)\psi_\xi$ and the question immediately arises, is $(\alpha/a - 1)\psi_\xi$ a cm.f.? For if an error is to give $a = a_0$, then this must have cm.f. $(\alpha/a_0 - 1)\psi_\xi$. Since the sum of $N$ independent drawings from $\pi(\xi_1)$ has cm.f. $N\psi_\xi$ it follows that $(\alpha/a_0 - 1)\psi_\xi$ is a cm.f. if $\alpha/a_0 - 1 = N$, that is $a_0 = \alpha/(N + 1)$ where $N$ is an integer. Thus it is always possible to make errors so that the regression is linear with $a = \alpha/(N + 1)$. As far as I know the general problem of when a positive multiple (not necessarily an integer) of a cm.f. is a cm.f. has not been solved. In particular taking this multiple to be the reciprocal of an integer, $N$, the problem becomes, can the distribution with cm.f. $\psi_\xi$ be regarded as the distribution of the sum of samples of $N$ from another population with cm.f. $\psi_\xi/N$? The problem is thus of interest in other connections besides regression.
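The two relations of this paragraph, the error variance $(\alpha/a - 1)\sigma^2$ needed in the normal case and the slope $\alpha/(N + 1)$ obtained when the error is the sum of $N$ independent drawings from the distribution of the true variable, are simple enough to tabulate. The following is a minimal sketch with exact rational arithmetic; the function names and the sample values are assumptions for illustration.

```python
from fractions import Fraction


def error_variance_for_slope(alpha, a, var_true):
    """Error variance (alpha/a - 1) * sigma^2 that yields observed slope a (normal case)."""
    ratio = Fraction(alpha) / Fraction(a) - 1
    if ratio <= 0:
        raise ValueError("need alpha/a > 1, i.e. the observed slope must be attenuated")
    return ratio * var_true


def slope_from_n_drawings(alpha, n):
    """Observed slope alpha / (N + 1) when the error has cm.f. N times that of the true variable."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    return Fraction(alpha) / (n + 1)


# Illustrative values only.
print(error_variance_for_slope(2, 1, 3))           # error variance needed to halve the slope
print([slope_from_n_drawings(2, n) for n in (1, 2, 3)])
```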

It is not difficult to see that the (positive) power of a c.f. (corresponding to a multiple of a cm.f.) obeys the obvious necessary conditions for a function to be a c.f. (Kendall, p. 99), so no c.f.'s are excluded on this account. The problem of sufficient conditions presents difficulties which have not been resolved.

There are other distributions besides the normal which satisfy (26): for instance $x_1$ may be distributed as $\chi^2$ with $\nu_1$ degrees of freedom and $\xi_1$ as $\chi^2$ with $\nu_2$ degrees of freedom, with $\nu_2 < \nu_1$, so that $\Delta\xi_1$ is distributed as $\chi^2$ with $(\nu_1 - \nu_2)$ degrees of freedom. (26) is then satisfied with $a = \alpha \nu_2 / \nu_1$.
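The $\chi^2$ example can be verified against (26) using the standard cumulative function of a $\chi^2$ variable with $\nu$ degrees of freedom, $-\tfrac{1}{2}\nu \log(1 - 2it)$. The sketch below is illustrative only; the particular values of $\alpha$, $\nu_1$ and $\nu_2$ are assumptions.

```python
import cmath


def chi2_cmf(dof, t):
    """Cumulative function (log characteristic function) of chi-squared with dof degrees of freedom."""
    return -0.5 * dof * cmath.log(1 - 2j * t)


alpha = 1.5          # assumed true slope
nu1, nu2 = 7, 3      # assumed degrees of freedom of x1 and xi1, with nu2 < nu1
a = alpha * nu2 / nu1

# Condition (26): (alpha - a) * psi_xi = a * K, with K the cm.f. of the error.
for t in (0.2, 1.0, 3.0):
    lhs = (alpha - a) * chi2_cmf(nu2, t)
    rhs = a * chi2_cmf(nu1 - nu2, t)
    print(t, abs(lhs - rhs))
```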

3.3. The speculations in the last paragraph are of theoretical rather than of practical importance. The practical requirement is to know what happens in any given case and this has been met. I shall speak of $\xi_i$ as the true values and $x_i$ as the observed values. Thus far I have considered the regression of $\xi_2$ on $\xi_1$, true on true, and of $x_2$ on $x_1$, observed on observed. Similarly I can consider the regression of $\xi_2$ on $x_1$, true on observed, and $x_2$ on $\xi_1$, observed on true. They exist because for fixed $\xi_1$ or $x_1$, $\bar{x}_2 = \bar{\xi}_2$, and accordingly there are only two distinct regression lines which I shall speak of as the true (regression on $\xi_1$) and the observed (regression on $x_1$) lines. Combining this result with (7) I have

Corollary II. The four regression lines, which coincide in pairs, are given by

$$\bar{\xi}_2 = \bar{x}_2 = \alpha \xi_1, \qquad \bar{\xi}_2 = \bar{x}_2 = a x_1 \tag{29}$$

with

$$a s_1^2 = \alpha \sigma_1^2 \tag{7'}$$

($s_1^2$ is the variance of $x_1$; $\sigma_1^2$ of $\xi_1$). The "bar" in the first of equations (29) means the average value for fixed $\xi_1$, in the second for fixed $x_1$: this should not lead to confusion.

As a particular case of (1) it may happen that the variance of $\xi_{n+1}$ about $\bar{\xi}_{n+1}$ is zero so that $\xi_{n+1}$ is uniquely determined by a set of values of $\xi_i$ $(i = 1, 2, \ldots n)$ and the regression can be written

$$\xi_{n+1} = \sum_{i=1}^{n} \alpha_i \xi_i \tag{30}$$

without the "bar." I shall speak of this as a functional relationship; some writers use the term structural relationship.

In the case of $n = 1$ which is being considered here this gives $\xi_2 = \alpha \xi_1$, and thus equally the regression of $\xi_1$ on $\xi_2$ (reversing the role of dependent and independent variables) exists as a functional relationship $\xi_1 = \frac{1}{\alpha} \xi_2$: this case is of considerable importance. The variables $x_1$ and $x_2$ are, of course, not functionally related, but the regressions of $x_1$ on $x_2$, $x_2$ on $x_1$ can be obtained from (7') and its analogous equation in terms of $\xi_2$ and $x_2$.

Corollary III. If $\xi_1, \xi_2$ are functionally related: $\xi_2 = \alpha \xi_1$, then $x_2, x_1$ are not functionally related but the regression lines of $x_2$ on $x_1$ and $x_1$ on $x_2$ (if they exist) are given by

$$\bar{x}_2 = a x_1, \qquad \bar{x}_1 = a' x_2$$

with

$$a s_1^2 = \alpha \sigma_1^2, \qquad a' s_2^2 = \frac{1}{\alpha} \sigma_2^2 = \alpha \sigma_1^2 \tag{31}$$
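As an illustration of Corollary III, the two observed regression slopes can be computed from the functional slope and the variances. This is a sketch under stated assumptions: the observed variances are taken as $s_1^2 = \sigma_1^2 + \delta_1^2$ and $s_2^2 = \alpha^2\sigma_1^2 + \delta_2^2$, where $\delta_1^2, \delta_2^2$ denote the error variances (this follows from the independence of errors and true values, as in (5)); the function name and the numbers are hypothetical.

```python
def functional_slopes(alpha, var_xi1, var_err1, var_err2):
    """Slopes (a, a') of the regressions of x2 on x1 and of x1 on x2 from equations (31).

    alpha    -- slope of the functional relationship xi2 = alpha * xi1
    var_xi1  -- variance of the true value xi1
    var_err1 -- variance of the error in x1
    var_err2 -- variance of the error in x2
    """
    s1_sq = var_xi1 + var_err1               # variance of observed x1
    s2_sq = alpha ** 2 * var_xi1 + var_err2  # variance of observed x2
    a = alpha * var_xi1 / s1_sq              # a * s1^2 = alpha * sigma1^2
    a_prime = alpha * var_xi1 / s2_sq        # a' * s2^2 = alpha * sigma1^2
    return a, a_prime


# Illustrative values only.
alpha = 2.0
a, a_prime = functional_slopes(alpha, var_xi1=1.0, var_err1=1.0, var_err2=4.0)
print(a, 1.0 / a_prime)   # the two observed lines, expressed as slopes of x2 against x1
```

For positive $\alpha$ the functional slope lies between $a$ and $1/a'$, so the two observed regression lines bracket the functional relationship.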

In connection with the functional relationship, though true in the more general case, the following result will be found useful:

Corollary IV. If true and observed regressions exist the expected value of $\xi_1$ from which a single observed $x_1$ arose is given by

$$\bar{\xi}_1 = x_1 \sigma_1^2 / s_1^2 \tag{32}$$

The distribution of $\xi_1$ for a given value of $x_1$ is

$$\frac{\pi(\xi_1)\, g_1(x_1 - \xi_1)}{\int \pi(\xi_1)\, g_1(x_1 - \xi_1)\, d\xi_1}$$

so the expected value of $\xi_1$ is

$$\bar{\xi}_1 = \frac{\int \xi_1\, \pi(\xi_1)\, g_1(x_1 - \xi_1)\, d\xi_1}{\int \pi(\xi_1)\, g_1(x_1 - \xi_1)\, d\xi_1}$$

$$= a x_1 / \alpha \qquad \text{from (17)}$$

$$= x_1 \sigma_1^2 / s_1^2 \qquad \text{from (7')}$$

This result may be compared with one due to Eddington which Seares uses to obtain his results. Some remarks on the meaning and use of this result will be found in Eddington's paper.
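Corollary IV can be checked by direct numerical integration in a case where both regressions are known to be linear. The sketch below assumes normal distributions for both the true value and the error (a case that satisfies (26)); the grid limits, step count and sample values are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def expected_true_value(x1, var_true, var_error, half_width=12.0, steps=20001):
    """Mean of xi1 given observed x1, by a Riemann sum over pi(xi1) * g1(x1 - xi1)."""
    h = 2.0 * half_width / (steps - 1)
    num = 0.0
    den = 0.0
    for k in range(steps):
        xi = -half_width + k * h
        w = normal_pdf(xi, var_true) * normal_pdf(x1 - xi, var_error)
        num += xi * w
        den += w
    return num / den


# Illustrative values: sigma1^2 = 2, error variance 1, so s1^2 = 3.
x1, var_true, var_error = 1.5, 2.0, 1.0
numeric = expected_true_value(x1, var_true, var_error)
formula = x1 * var_true / (var_true + var_error)   # equation (32)
print(numeric, formula)
```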
3.4. There is a result due to Allen which I prove as

Corollary V. If

$$\xi_1 = l \xi, \qquad \xi_2 = m \xi \tag{33}$$

where $l$ and $m$ are constants: the necessary and sufficient condition that the regression of $x_2$ on $x_1$ be linear whatever may be $l$ and $m$ and provided the variance of $\xi$ is finite is that $\xi$ and $\Delta\xi_1$ should be normally distributed. (This is not exactly the result as stated in Allen's paper because that requires that all the moments of $\xi$ and $\Delta\xi_1$ should be finite which is unnecessary.)

Before proving the corollary I need a

Lemma. If $\phi(x)$, $f(x)$ are continuous, differentiable functions satisfying

$$\phi(xy) = f(x)\, \phi(y) \tag{34}$$

for all $x, y$ then

$$\phi(y) = A y^r \tag{35}$$

where $A$, $r$ are constants.

SUPP. VOL. IX. NO. 2. 




Differentiate (34) partially with respect to $x$:

$$\frac{\partial \phi(xy)}{\partial x} = \frac{df}{dx}\, \phi(y)$$

or

$$\frac{y}{x} \frac{\partial \phi(xy)}{\partial y} = \frac{df}{dx}\, \phi(y)$$

so

$$\frac{\partial \phi(xy)}{\partial y} = \frac{x}{f(x)} \frac{df}{dx} \cdot \frac{\phi(xy)}{y} \qquad \text{using (34)}$$

$$= K(x)\, \frac{\phi(xy)}{y} \qquad \text{say,}$$

whence on integration

$$\log \phi(xy) = K(x) \log y + C(x)$$

therefore

$$\phi(xy) = e^{C(x)}\, y^{K(x)};$$

putting $x = 1$, $\phi(y) = A y^r$ with $A = e^{C(1)}$, $r = K(1)$, giving the result required.

Turning now to the corollary I take the particular case of Theorem I when there is a functional relationship between $\xi_1$ and $\xi_2$. Now Allen has assumed such a relationship ($\xi_1 = l\xi$, $\xi_2 = m\xi$, so $\xi_1 = \frac{l}{m} \xi_2$) and requires the regression to be linear whatever be $l$ and $m$: clearly this is the same as requiring the regression to be linear for a change of scale in both $\xi_1$ and $\xi_2$ axes. The latter is irrelevant but the former requires that (26) be replaced by

(cm.f. of $l\xi$) = constant times (cm.f. of $\Delta\xi_1$)

for all $l$. If $\psi(t)$ is the cm.f. of $\xi$, $\psi(lt)$ is the cm.f. of $l\xi$, whence it follows that on eliminating the cm.f. of $\Delta\xi_1$

$$\psi(lt) = f(l)\, \psi(t)$$

with $f(l)$ some function of $l$: so that, applying the lemma,

$$\psi(t) = A t^r$$

Differentiating this twice and putting $t = 0$ gives a multiple of the variance of $\xi$ which is not zero except for Dirac's $\delta$-distribution, so that $r = 2$ and the distribution of $\xi$ and hence $\Delta\xi_1$ is normal. This proves the corollary.

The essential distinction between Allen's theorem and my Theorem I is the requirement of linearity for all $l$ and $m$. The example given in her paper satisfies the requirements of Theorem I for particular values of $l$ and $m$ and then the regression is linear, whereas it does not satisfy her theorem and is not linear for all $l$ and $m$.


4. The General Case of Multiple Regression 

4.1. I first obtain the relationship between the $a$'s and the $\alpha$'s. Differentiate (22) with respect to $t_j$ giving

$$\sum_{i=1}^{n} \alpha_i \frac{\partial^2 \psi_\xi}{\partial t_i\, \partial t_j} = \sum_{i=1}^{n} a_i \frac{\partial^2 \psi_x}{\partial t_i\, \partial t_j}$$

which on putting $t_i = 0$ $(i = 1, 2 \ldots n)$ yields

$$\sum_{i=1}^{n} \alpha_i \rho_{ij} \sigma_i \sigma_j = \sum_{i=1}^{n} a_i r_{ij} s_i s_j \qquad (j = 1, 2 \ldots n) \tag{36}$$

where $\sigma_i^2$, $s_i^2$ are the variances of $\xi_i$, $x_i$ respectively, $\rho_{ij}$ is the correlation between $\xi_i$ and $\xi_j$ and $r_{ij}$ is the correlation between $x_i$ and $x_j$ ($\rho_{ii} = r_{ii} = 1$). These equations may most conveniently be written in matrix notation: introduce the row vectors

$$\alpha = (\alpha_1, \alpha_2, \ldots \alpha_n), \qquad a = (a_1, a_2, \ldots a_n)$$

and let $\Sigma$ and $S$ be respectively the variance-covariance matrices of the $\xi$'s and $x$'s. (36) is then simply

$$\alpha \Sigma = a S \tag{37}$$

It follows from Theorem II that

$$S = \Sigma + \Lambda \tag{38}$$

where $\Lambda$ is the variance-covariance matrix of the $\Delta\xi_i$ but since they are independent it reduces to a diagonal matrix with elements the variances of $\Delta\xi_i$.

(37) may thus be written

$$\alpha (S - \Lambda) = a S$$

i.e.

$$(\alpha - a) S = \alpha \Lambda \tag{39}$$

Now $S$ is a positive definite matrix so that (39) has the unique solution

$$(\alpha - a) = \alpha \Lambda S^{-1} \tag{40}$$

I postpone discussion of this solution until later (§ 4.6): the points to notice at the moment are that the solution (40) is unique and that it always exists.

4.2. I showed above (§ 3.2) that for simple regressions whatever be $\psi_\xi$ there would exist $K$ such that the equation of condition was satisfied at any rate for some values of $a$: this was done by taking $K = N\psi_\xi$ giving $a = \alpha/(N + 1)$. It is easy to see that this ceases to hold with multiple regression because $\psi_\xi$ has to satisfy certain conditions independent of the $K_i$. For taking the condition in the form of (20) and differentiating with respect to $t_k$ and $t_l$ with $k \neq l$ I have

$$\sum_{i=1}^{n} (\alpha_i - a_i) \frac{\partial^3 \psi_\xi}{\partial t_i\, \partial t_k\, \partial t_l} = 0 \tag{41}$$

This condition is independent of the $K_i$ and accordingly must be satisfied by any $\psi_\xi$ before the nature of the $K_i$ is considered. (41) yields all the equations

$$\sum_{i=1}^{n} (\alpha_i - a_i) \frac{\partial^{p+1} \psi_\xi}{\partial t_1^{p_1} \ldots \partial t_{i-1}^{p_{i-1}}\, \partial t_i^{p_i + 1}\, \partial t_{i+1}^{p_{i+1}} \ldots \partial t_n^{p_n}} = 0$$

when $p = \sum p_i$ and at least two of the $p_i$ are different from zero: provided, of course, the differentials exist. In terms of cumulants this may be written

$$\sum_{i=1}^{n} (\alpha_i - a_i)\, \kappa(p_1, p_2, \ldots p_{i-1}, p_i + 1, p_{i+1}, \ldots p_n) = 0 \tag{42}$$

with at least two of the $p_i$ different from zero. Now these equations, either in the form of (41) or (42), together with (40) provide at least $n + 1$ equations in the $n$ unknowns $\alpha_i - a_i$: accordingly solutions will only exist for certain values of the matrix $\Lambda$, that is for certain values of the error variances. Now this is not the sort of situation one is interested in: what one wants to be able to say is that when $\psi_\xi$ is given (i.e. is of a given type) then the regression will continue to be linear provided $K_i$ is of some other type irrespective of the variances of $\Delta\xi_i$. The fact that some distributions of special variance may produce linear regression is not usually of such interest. Hence I only consider distributions for which (41) vanish identically, that is for which

$$\frac{\partial^3 \psi_\xi}{\partial t_i\, \partial t_k\, \partial t_l} = 0 \tag{43}$$

for all $i, k, l$, excluding the cases $i = k = l$.





4.3. The equation of condition can be integrated in either of the forms (20) or (23) when it is an example of Lagrange's linear equation. Taking the form (20), it is integrated by forming the subsidiary equations

$$\frac{dt_1}{\alpha_1 - a_1} = \frac{dt_2}{\alpha_2 - a_2} = \ldots = \frac{dt_n}{\alpha_n - a_n} = \frac{d\psi_\xi}{\sum_{i=1}^{n} a_i \dfrac{dK_i}{dt_i}} \tag{44}$$

$n - 1$ independent integrals are immediately found from

$$\frac{dt_i}{\alpha_i - a_i} = \frac{dt_{i+1}}{\alpha_{i+1} - a_{i+1}} \qquad (i = 1, 2, \ldots n - 1)$$

in the form $\mu_i = (\alpha_{i+1} - a_{i+1})\, t_i - (\alpha_i - a_i)\, t_{i+1} = $ constant.

Another integral, independent of these, is given by noting that each $dt_i/(\alpha_i - a_i)$ equals the right-hand expression in (44), so multiplying the numerator and denominator of each by $a_i\, dK_i/dt_i$ and adding

$$\sum_{i=1}^{n} \frac{a_i}{\alpha_i - a_i} \frac{dK_i}{dt_i}\, dt_i = d\psi_\xi$$

which gives

$$\psi_\xi - \sum_{i=1}^{n} \frac{a_i}{\alpha_i - a_i} K_i(t_i) = \text{constant}.$$

The general solution is

$$\psi_\xi = \sum_{i=1}^{n} \frac{a_i}{\alpha_i - a_i} K_i + F(\mu_1, \mu_2, \ldots \mu_{n-1}) \tag{45}$$

where $F$ is an arbitrary function. But since I only consider solutions satisfying (43) all the solutions are given by

$$\psi_\xi = \sum_{i=1}^{n} \frac{a_i}{\alpha_i - a_i} K_i + \sum_{i,j=1}^{n-1} c_{ij}\, \mu_i \mu_j \tag{46}$$

where the $c_{ij}$ are constants with $c_{ij} = c_{ji}$.

Then for any $\psi_\xi$ the $n(n - 1)/2$ coefficients $c_{ij}$ can be determined from the $n(n - 1)/2$ coefficients of $t_i t_j$ $(i \neq j)$ in $\psi_\xi$ and then there are $n$ [$n(n + 1)/2 = n(n - 1)/2 + n$] equations left to determine for any $\delta_i^2$ (the variance of $\Delta\xi_i$) the values of the $a_i$. The remaining cumulants are then uniquely determined.

The reader may object that it has been assumed that $\alpha_i \neq a_i$ for any $i$. If $\alpha_i = a_i$ for some $i$ then I shall still assume (43) to hold for the same reason as I did in the other case. The solution then follows by continuity from (46) as the reader may verify by evaluating the $c_{ij}$ as mentioned above and hence the $a_i$'s when he will obtain (37) which avoids the singularities.

Thus the only solutions which are of general interest are those valid for all values of the variances of $\Delta\xi_i$; the other singular solutions may, in fact will, exist, but only suitable combinations of the errors will provide linear regression. It may happen that in the singular solutions other values of the errors do not produce much deviation from linearity and thus for practical purposes will provide solutions: this remains to be studied.

I have then the following

Theorem III. Under the conditions of § 1.2 the necessary and sufficient condition that the regression of $x_{n+1}$ on the remaining $x_i$'s be linear for all values of the error variances is given by (46).

Corollary I to Theorem I is a particular case of this theorem.

When the $a_i$'s have been determined the remaining cumulants of the $\Delta\xi_i$ are given immediately from (46): whether or not these will form admissible cm.f. is unknown.

4.4. The case where the $\xi_i$ are independent is simple and may be directly compared to the simple regression of § 3. The c.f. can be expressed as a product

$$\prod_{i=1}^{n} \phi_\xi^{(i)}(t_i)$$

where $\phi_\xi^{(i)}$ is the c.f. of $\xi_i$, and hence the cm.f. is expressible as a sum

$$\sum_{i=1}^{n} \psi_\xi^{(i)}(t_i).$$

(20) then splits up into the $n$ ordinary differential equations

$$(\alpha_i - a_i) \frac{d\psi_\xi^{(i)}}{dt_i} = a_i \frac{dK_i}{dt_i} \qquad (i = 1, 2, \ldots n)$$

which have been met before, (25), and integrate to give

$$(\alpha_i - a_i)\, \psi_\xi^{(i)} = a_i K_i \tag{26'}$$

which is exactly analogous to the state of affairs in § 3. The following results can be obtained as before:

(i) $\alpha_i \psi_\xi^{(i)} = a_i \psi_x^{(i)}$ (27')

(ii) $(\alpha_i - a_i)\, \psi_x^{(i)} = \alpha_i K_i$ (28')

(iii) $\alpha_i \sigma_i^2 = a_i s_i^2$ (29')

so that if $\alpha_i > 0$, $\alpha_i > a_i > 0$.

(iv) A solution of (26') which is a cm.f. always exists by taking $K_i = N \psi_\xi^{(i)}$.

(v) $\bar{\xi}_i = x_i \sigma_i^2 / s_i^2$ (32')

H 

(vi) Allen's theorem may be extended to the case where it is required that the regression should be linear for all $\alpha_i$. The $\xi_i$, and hence the $\Delta\xi_i$, must be normally distributed.


4.5. The normal distribution will obviously satisfy (43) and so is in the admissible class of distributions. $\psi_\xi$ is then a quadratic in $t_i$ and from (46) it is then obvious that $K_i$ is a quadratic and hence that $\Delta\xi_i$ must be normally distributed. Conversely if $\Delta\xi_i$ $(i = 1, 2, \ldots n)$ are normally distributed then the $\xi_i$ must be jointly normally distributed.

A general type of distribution which satisfies (43) is given by

$$\xi_i = \eta_i + \zeta_i \qquad (i = 1, 2, \ldots n)$$

where the $\eta_i$ have a multivariate normal distribution and the $\zeta_i$ are independent of the $\eta_i$ and of each other. The distribution of error then satisfies

$$\Delta\xi_i = \Delta\eta_i + \Delta\zeta_i \qquad (i = 1, 2 \ldots n)$$

where the $\Delta\eta_i$ are normally distributed and the $\Delta\zeta_i$ have distributions whose cm.f. are multiples of the cm.f. of $\zeta_i$. This follows from the additive properties of the cm.f.

4.6. The equations for the $a_i$ have been obtained above, (37), and the method has been indicated whereby they result from the solution (46). If the $\xi_i$ are independent the matrices $\Sigma$ and $S$ are diagonal and the solution (29') follows immediately. The general case is more complicated: it ceases to be true (e.g.) that the slope of the "line" is decreased, it may be unaltered or increased, depending on the matrices $\Sigma$ and $\Lambda$. An inequality can be obtained, however, in the special case where all the $\Delta\xi_i$ have the same variance $\lambda$: $\Lambda$ is then equal to $\lambda I$, where $I$ is the unit matrix of order $n$. (37) is then

$$\alpha \Sigma = a(\Sigma + \lambda I)$$

so

$$\alpha = a(I + \lambda \Sigma^{-1})$$

i.e.

$$\alpha \alpha' = a(I + \lambda \Sigma^{-1})^2 a'$$

or

$$\alpha \alpha' - a a' = a(2\lambda \Sigma^{-1} + \lambda^2 \Sigma^{-2})\, a' \tag{47}$$

($a'$ is the transpose of $a$). Now as $\Sigma$ is positive definite so are $\Sigma^{-1}$ and $\Sigma^{-2}$, and thus the right-hand side of this equation is always positive, consequently

$$\alpha \alpha' > a a',$$

or reverting to the longer notation,

$$\sum_{i=1}^{n} \alpha_i^2 > \sum_{i=1}^{n} a_i^2 \tag{48}$$

At one time I thought this relation held generally, but unequal errors can be arranged so that (48) no longer holds: for generally the right-hand side of (47) is

$$a(\Sigma^{-1} \Lambda + \Lambda \Sigma^{-1} + \Lambda \Sigma^{-2} \Lambda)\, a'$$

and the matrix of this is not necessarily positive definite. An example may prove instructive.

If (4) is generalized to the case of $n + 1$ variables it follows that combining this with (36) one has

$$\sum_{i=1}^{n} a_i r_{ij} s_i s_j = \rho_{j, n+1}\, \sigma_j \sigma_{n+1} = \sum_{i=1}^{n} \alpha_i \rho_{ij} \sigma_i \sigma_j \tag{49}$$

Take $n = 2$ and let the variance-covariance matrix of $\xi_i$ be

$$\begin{pmatrix} 3 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 4 \end{pmatrix}$$

so that $\Sigma$ is the minor from the first two rows and columns

$$\Sigma = \begin{pmatrix} 3 & -1 \\ -1 & 2 \end{pmatrix}$$

Let $\Lambda$ be

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

so that $S$ is

$$\begin{pmatrix} 4 & -1 \\ -1 & 2 \end{pmatrix}$$

The right-hand of the pair of equations (49) gives

$$1 = 3\alpha_1 - \alpha_2, \qquad -1 = -\alpha_1 + 2\alpha_2, \qquad \text{so } \alpha_1 = \tfrac{1}{5},\ \alpha_2 = -\tfrac{2}{5};$$

the left-hand gives

$$1 = 4a_1 - a_2, \qquad -1 = -a_1 + 2a_2, \qquad \text{so } a_1 = \tfrac{1}{7},\ a_2 = -\tfrac{3}{7},$$

so that

$$\sum \alpha_i^2 = \tfrac{1}{5} < \sum a_i^2 = \tfrac{10}{49}$$

contradicting (48).

This is rather an extreme case since $\delta_2^2 = 0$ so that $\xi_2$ is free from error, but any sufficiently small value of $\delta_2^2$ will, by continuity, suffice to contradict (48).
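The arithmetic of this example can be confirmed with exact fractions. The sketch below assumes the matrices read off from the two pairs of linear equations in the example, namely covariance matrix of the true independent variables with rows (3, -1) and (-1, 2), error variances 1 and 0, and covariances 1 and -1 with the dependent variable; the helper `solve_2x2` is a hypothetical name and uses Cramer's rule.

```python
from fractions import Fraction as F


def solve_2x2(m, rhs):
    """Solve the 2x2 linear system m v = rhs by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    v0 = (rhs[0] * m[1][1] - m[0][1] * rhs[1]) / det
    v1 = (m[0][0] * rhs[1] - rhs[0] * m[1][0]) / det
    return v0, v1


Sigma = [[F(3), F(-1)], [F(-1), F(2)]]   # covariance matrix of the true independent variables
Lam = [[F(1), F(0)], [F(0), F(0)]]       # error variances: only the first variable is in error
S = [[Sigma[i][j] + Lam[i][j] for j in range(2)] for i in range(2)]   # equation (38)
cov_dep = [F(1), F(-1)]                  # covariances with the dependent variable

alpha = solve_2x2(Sigma, cov_dep)        # true coefficients: right-hand pair of (49)
a = solve_2x2(S, cov_dep)                # observed coefficients: left-hand pair of (49)

sum_alpha_sq = sum(v * v for v in alpha)
sum_a_sq = sum(v * v for v in a)
print(alpha, a, sum_alpha_sq, sum_a_sq)
```

Because both matrices are symmetric, solving with the matrix on the left is equivalent to the row-vector form used in (37).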

The condition for $a_1 = \alpha_1$ is easy to obtain from (39): it is that the determinant of the matrix formed from $S$ by replacing the first column of $S$ by the column vector $\Lambda \alpha'$ be zero. For $n = 2$ the condition for $a_1 = \alpha_1$ is

$$\begin{vmatrix} \alpha_1 \delta_1^2 & r_{12} s_1 s_2 \\ \alpha_2 \delta_2^2 & s_2^2 \end{vmatrix} = 0$$

i.e.

$$\alpha_1 \delta_1^2 s_2 = \alpha_2 \delta_2^2 r_{12} s_1$$

which is possible if $\alpha_1$ and $\alpha_2 r_{12}$ have the same sign. Thus the slope may be unaltered despite the errors. It is easy to see that it is not possible for all the slopes to remain unaltered, for then $a$ would equal $\alpha$, and from (39) this would imply $\alpha = 0$, which is impossible.

4.7. Simpler expressions for the $a_i$ in terms of the $\alpha_i$ may be obtained by means of the higher order cumulants. Using the notation of (42) and letting $\kappa_i(p)$ be the $p$th cumulant of $\Delta\xi_i$, (46) gives for $p_i > 2$

$$\kappa(0, \ldots p_i, \ldots 0) = \frac{a_i}{\alpha_i - a_i}\, \kappa_i(p_i) \tag{50}$$

If $k(p_1, p_2, \ldots p_n)$ are the cumulants of $x_i$ $(i = 1, 2, \ldots n)$ then from Theorem II

$$\kappa(0, \ldots, p_i, \ldots 0) + \kappa_i(p_i) = k(0, \ldots p_i, \ldots 0)$$

so in a symmetrical form I have with $p_i > 2$

$$\alpha_i\, \kappa(0, \ldots p_i, \ldots 0) = a_i\, k(0, \ldots p_i, \ldots 0) \tag{51}$$

which also would follow from (22) using (43).

5. The Interpretation of the Regression Lines and the Functional Relationship. 

5.1. In the preceding analysis two regression lines have been mentioned : 

+ j ^ (1) 

1-1 

and 

n 

« I I =■■ ^ "iXi (16) 

J 

The conditions under which the latter exists when the former does and the relationships
between the two have been fully discussed. I now consider what interpretation is to be put upon
them, what use they are in statistical methodology, and relate this to the previous work on the
subject. The two lines I call respectively the true and observed regression lines but, for reasons
which will appear later (§ 5.2), the most important case is where the true line is a line of functional
relationship, so that

$$\xi_{n+1} = \sum_{i=1}^{n} \alpha_i \xi_i \qquad (30)$$

As I have already said (§ 3.3), this is merely a particular case of the above analysis, and all
the general results given above remain true when (30) holds instead of the more general (1). I
suppose further that (30) is the only functional relationship which exists between the $\xi_i$
$(i = 1, 2, \ldots, n + 1)$: if there were another

$$\xi_{n+1} = \sum_{i=1}^{n} \beta_i \xi_i$$

this would imply

$$\sum_{i=1}^{n} (\beta_i - \alpha_i)\xi_i = 0$$

i.e. there is a functional relationship between the $\xi_i$ $(i = 1, 2, \ldots, n)$. I suppose this not to be
so: an idea of the difficulties which are present in the contrary case is to be found in a recent
paper by Haavelmo. This paper contains several remarks relevant to the subject matter of
this section.

5.2. I consider the case of simple regression ($n = 1$), and start from equations (29). The
second of these equations says that the mean value of either the true or observed value of the
dependent variable is linearly related to the observed value of the independent variable, $x_1$, and
the slope of the line is the usual regression coefficient of $x_2$ on $x_1$. Here then is a use for the
observed regression line: given a value of $x_1$ (not $\xi_1$) the most likely or expected value of $x_2$ or $\xi_2$
is found by multiplying $x_1$ by $a$. Similar remarks apply to the first of equations (29): the mean



232 


Lindley — Regression Lines and tfie 


[No. 2, 


value of either the true or observed value of the dependent variable is linearly related to the true
value of the independent variable, $\xi_1$, and the slope of the line is the usual regression coefficient
of $\xi_2$ on $\xi_1$.

Now $x_1$ is the measurement of a value $\xi_1$ and this is subject to an error which I called $\Delta\xi_1$.
From the interpretations of equations (29) it is clear that the value of either $x_2$ or $\xi_2$ can be
estimated from either $x_1$ or $\xi_1$ (by estimate I mean its mean value can be found): from the former
by using $\bar{x}_2 = \bar{\xi}_2 = a x_1$, from the latter by using $\bar{x}_2 = \bar{\xi}_2 = \alpha\xi_1$. Since, however, it is only
$x_1$ and not $\xi_1$ that is known, one would naturally estimate the dependent variable from the
observed value $x_1$ despite the fact that this quantity involves an error. One, of course, would
need the error to obey the conditions of Theorem I in order that both the lines exist. Thus the
use of a regression line of a dependent variable on an independent variable is to determine the
mean value of the dependent variable from the independent variable even if the latter is in error.
The essential distinction between the two lines is that one refers to observed values of the
independent variable, the other to true values.

It would thus appear that the true regression line, relating as it does to $\xi_1$, is of little use since
the mean value of the dependent variable can be found by means of the observed line. When
the particular case of the functional relationship is considered it is evident that a knowledge of
it is of importance, because most of the laws of the empirical sciences are stated in this form:
Ohm’s law, for example, and the law of density which I shall consider below. It is for this reason
that I suppose (1) replaced by the less general (30) and when I speak of the regression line I refer
to the observed line, unless otherwise stated. It is perhaps most helpful if I consider a simple,
illustrative example.

5.3. Suppose an experimenter has several pieces of metal of different shapes and sizes and
he determines the mass and volume of each piece. Then it is reasonable to suppose that each
observation is subject to an error, so that the observations of mass and volume may be considered
the $x_2$ and $x_1$ respectively of Theorem I, and the true masses and volumes (which are ideals and
can never be measured) are the $\xi_2$ and $\xi_1$. The first problem I consider is how to estimate the
density of the metal. The hypothesis of the existence of density is that $\xi_2 = \alpha\xi_1$, where $\alpha$ is the
density: that is, the true mass and true volume are functionally related. Hence the question
reduces to how to estimate the functional relationship.

Leaving that for a moment I now pose another problem. Suppose the experimenter does
not wish, for reasons of convenience, to carry out weighings: he would much sooner measure
the volumes, and from these, by means of the knowledge of the existence of a functional relationship,
predict the masses of the pieces. Then the problem is how, from observed volumes, to predict
true masses. Now Corollary III shows that observed quantities are not functionally related, but
they may be distributed so that the regression of one on another is linear. If they are so
distributed the second of equations (29) shows the answer: there is not a unique true value
corresponding to a given $x_1$, but the mean of the true values is known to be linearly related to the observed
value $x_1$ and the slope of this line is the regression coefficient of $x_2$ on $x_1$. Hence the question
how to predict $\xi_2$ from $x_1$ is altered to how to estimate the observed regression. Here then are
two problems which I call the problem of the functional relationship (estimation of density) and
the problem of prediction, and it is the second of these that is answered by finding the regression
line, whether or not $x_1$ is free from error: if it is free from error, and only then, the answers to
the two are the same. They are two distinct problems and should not be confused. In a recent
paper by Wald there is a discussion of the use of the functional relationship for the purposes
of prediction, and an example shows that even when the former is known accurately the predicted
values are biased. How the prediction is to be carried out is not stated: it is hoped that the
above considerations show the answer whenever the error obeys the required relationship.

5.4. It is sometimes assumed that linear regression is only valid if the independent variable,
$x_1$, is free from serious error. For example, the following remark is quoted from Charnley:
“The use of the regression equation to describe a relation between two variates is legitimate only
if one of the variates is free from any appreciable error.” He is correct if he refers to the
functional relationship, but the regression equation continues to have a meaning if both the
variates are in error: it describes the relationship between the mean value of one variate and
the other variate, and is thus a legitimate weapon to use (e.g. in the problem of prediction) in
any case.





5.5. It should be realized that in order to estimate a regression any values of the independent
variable may be chosen: there is no need for the sample to be a random selection from the whole
population because, by the linearity, information got from a limited range of values of the
independent variable supplies knowledge of the whole. The same is obviously not true of the
dependent variable, where such selection would produce a biased estimate of the mean. The
situation has been discussed in some detail by Eisenhart, and I take the following quotation from his
paper; in my notation his $X$, $Y$ are $\xi_1$, $\xi_2$, and his $x$, $y$ are $x_1$, $x_2$.

“It does not seem to be realized that the fitting should be done in terms of the deviations
which actually represent ‘error.’ Thus when the research worker selects the $X$-values in advance
and holds $x$ to these values without error and then observes the corresponding $y$-values, the
errors are in the $y$-values, so that even if he is interested in using observed values of $Y$ to estimate
$X$ he should nevertheless fit $Y = a + bX$ and use the inverse of this relation to estimate $X$, i.e.
$X = (Y - a)/b$ with the best available estimate of $Y$ substituted for $Y$.”

This method is not very satisfactory: if the $X$-values were a random sample from the
population, then in order to estimate $X$ from $Y$ the $Y$ assumes the role of independent variable which,
being subject to error, vitiates the use of $Y = a + bX$ to estimate $X$. But if the $X$-values are
not so chosen then the only course to adopt is to use the fit $Y = a + bX$, but it should be borne
in mind that it will not give an unbiased answer. Of course the problem is a little artificial in
that it is postulating that $X$ can be held to certain values whereas $Y$ cannot, and yet $Y$ is being
used to estimate $X$. The solution of this problem rests rather with the experimenter than the
statistician.

5.6. In the analysis of variance the sum of squares can be split up into two parts: that due
to the regression and that which is called the residual. Obviously the regression line and not
the functional relationship is the correct one to use for the value about which the regression
deviations are taken, since it is desired to use the mean value of the dependent variable which
is better estimated by the regression line.

5.7. The reason why the practical importance of the distinction between the two lines is
slight in many cases is probably explained by the fact that the difference between $a$ and $\alpha$ due
to the error in $x_1$ is quite small. Suppose, for example, that the standard error of an observation
is one-tenth the standard error of the population of true values, which is quite a considerable
error. Then $\delta_1^2 = \sigma_1^2/100$ so that $a = \tfrac{100}{101}\alpha$, a change of less than 1 per cent. due to error. Thus
in the measurement of density cited above (§ 5.3) the error would certainly be smaller than in
this example and the two slopes are for all practical purposes the same, at any rate in comparison
with the sampling errors, though in prediction at points very far distant from the mean the effect
might be noticeable. In economic data the errors are probably more material and the distinction
may be more important here than in physical data.
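
The arithmetic of this example can be checked with a short Python sketch. It assumes only the simple-regression relation $a\sigma_x^2 = \alpha\sigma_\xi^2$ quoted in § 7.2, so that $a/\alpha = \sigma^2/(\sigma^2 + \delta^2)$; the function name `attenuation_factor` and its argument names are illustrative and do not appear in the text.

```python
def attenuation_factor(sigma_true: float, delta_error: float) -> float:
    """Return a/alpha = sigma^2 / (sigma^2 + delta^2) for simple regression.

    sigma_true  -- standard deviation of the true values (hypothetical name)
    delta_error -- standard deviation of the error of observation (hypothetical name)
    """
    return sigma_true ** 2 / (sigma_true ** 2 + delta_error ** 2)


# Error of observation one-tenth of the spread of the true values, as in the text.
factor = attenuation_factor(sigma_true=1.0, delta_error=0.1)
percent_change = 100.0 * (1.0 - factor)
print(f"a/alpha = {factor:.4f}; change = {percent_change:.2f} per cent")
```

With these inputs the ratio is 100/101, so the observed slope falls short of the functional slope by just under 1 per cent.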

5.8. One advantage of the functional relationship over the regression line is that the slope
of the former can be entered in standard tables as a constant whereas the slope of the latter would
require the ratio $\sigma_1^2/s_1^2$ from which it was determined to be tabulated as well. It would
accordingly be useful if the problem of prediction could be solved using the functional relationship
obtained from the tables. This may be done by using Corollary IV, which provides an estimate
of the true values $\xi_1$ from the observed $x_1$ provided $\sigma_1^2/s_1^2$ is known; for combining (32) and
the functional relationship

$$\bar{\xi}_2 = \alpha x_1 \sigma_1^2 / s_1^2;$$

this, of course, is the same as $\bar{\xi}_2 = a x_1$ from (7′), as is otherwise obvious. The usual method is
merely to get $\alpha$ from the tables and then say $\bar{\xi}_2 = \alpha x_1$, which is false: though, as I have said
before, the error is slight in many cases.

5.9. Reverting to the physical example, current practice for the estimation of the density, I
feel sure, would be to calculate the quantity, sum of all masses over sum of all volumes. This
is a method which has been advocated by Campbell, and attacked from the point of view of least
squares by Stewart. It is probably only justifiable if the estimate is normally distributed, which
is unlikely since most possibly the observed quantities will be, whence the estimate, being the
ratio of two normally distributed variables, is not so distributed. The main justification for the
physicist lies in its simplicity.





5.10. Little more need be added in connection with multiple regression. It is usually assumed
that the quantities are normally distributed, and Section 4 has shown that both the regression
lines will exist if the errors are normally distributed. Similar remarks will then apply to the
distinctions between regression lines and the functional relationships as have been given for
simple regressions. In the general case it may be noticed, putting it rather loosely, that in view
of the restriction (43) more opportunities are present for both regression lines to exist when $n = 1$
than in the contrary case.

5.11. I summarize : 

(a) The functional relationship is required for the statement of laws in the empirical sciences 
(e.g. that of density) which would hold if no errors existed. 

(b) The regression line is required for prediction of either true or observed values of one
variate from observations of the other whether or not this latter is in error: it being understood
that the $x_1$ from which the prediction is to be made comes from the same population as those $x_1$'s used
in the estimation of the regression line.

(c) The functional relationship, when available in tables, can be used for prediction provided
true values can be estimated as in the method above.

(d) The functional relationship and the regression line are the same if, and only if, the inde- 
pendent variable is not in error. 

6. Estimation of the Functional Relationship

6.1. The general problem of estimating the functional relationship has been tackled by
Geary. His method is as follows: he first observes that

$$\kappa(p_1, p_2, \ldots, p_n) = k(p_1, p_2, \ldots, p_n) \qquad (52)$$

in the notation of § 4.7, where two or more of the $p_i$ differ from zero; this follows from Theorem II.
He next observes that if $\sum_{i=1}^{n} \alpha_i \xi_i = 0$ (a functional relationship in the homogeneous form)
then

$$\sum_{i=1}^{n} \alpha_i\,\kappa(p_1, p_2, \ldots, p_{i-1}, p_i + 1, p_{i+1}, \ldots, p_n) = 0 \qquad (53)$$

This may be proved by an extension to $n$ variables of the proof of (11). Combining (52)
and (53) he has

$$\sum_{i=1}^{n} \alpha_i\,k(p_1, p_2, \ldots, p_{i-1}, p_i + 1, p_{i+1}, \ldots, p_n) = 0 \qquad (54)$$

where two or more of the $p_i$'s differ from zero. The $k$'s can be estimated from the data and thus
from a suitably chosen selection of equations (54) the $\alpha_i$ may be estimated.

The similarities between equations (53) and (42) are obvious: remembering the slight difference
in notation in the way the functional relationship has been written down it is easy to see that if
the regression of $x_j$ on the remaining $x$'s is linear then (43) implies

$$\kappa(p_1, \ldots, p_{j-1}, 0, p_{j+1}, \ldots, p_n) = 0$$

or

$$k(p_1, \ldots, p_{j-1}, 0, p_{j+1}, \ldots, p_n) = 0$$

with at least two of the $p_i$ different from zero, so that some of equations (54) fail.

Thus the utility of the method diminishes when the requirements of linearity of regression are
satisfied. The only case I shall consider is that of normally distributed errors and true values
which is not covered by Geary's treatment in so far as all the $\alpha_i$ are concerned: he does give
a method of obtaining a sub-set of them. In the first place I confine my attention to linear
regression.

6.2. The remark 5.11 (d) suggests that if the usual method of estimating the regression line
when the independent variate is free from error could be extended to cover the general case one
would obtain the estimates required. This I proceed to do. Wilks (p. 157) introduces regression
theory in the following manner: suppose for any value of $x$, $x_k$ say, $y$ is normally distributed
about $a x_k + b$ with variance $\sigma^2$ (independent of $x_k$) so that $x$ is a fixed variate and $y$ a random
variate, the distribution of $y$ about $ax + b$ being regarded as the error; he then requires to
estimate $a$, $b$ and $\sigma^2$, which he does by the method of maximum likelihood giving

$$\hat{a} = \frac{\sum (x_k - \bar{x})(y_k - \bar{y})}{\sum (x_k - \bar{x})^2}, \qquad \bar{y} = \hat{a}\bar{x} + \hat{b}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum (y_k - \hat{a}x_k - \hat{b})^2 \qquad (55)$$

where the sample is $(x_1, x_2, \ldots, x_n;\ y_1, y_2, \ldots, y_n)$ and the circumflex is used to denote
maximum likelihood estimates. His method depends essentially on the fact that the $x_k$ are free from
error: it can be easily extended to the general case for estimating the functional relationship.
It should be noted that the suffix refers to the number in the sample, not, as before, to the variable:
$\xi_1$ is replaced by $\xi$, $\xi_2$ by $\eta$.
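
The estimates for the error-free case can be sketched in a few lines of Python. This is a minimal illustration of the usual maximum likelihood formulas for a fixed $x$ and a normally distributed $y$, not a transcription of Wilks's text; the function name `ml_regression` and the sample values are hypothetical.

```python
def ml_regression(xs, ys):
    """Maximum likelihood fit of y = a*x + b when x is free from error.

    Returns (a_hat, b_hat, sigma2_hat); the variance uses the divisor n,
    as maximum likelihood does, not n - 2.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a_hat = sum_xy / sum_xx
    b_hat = y_bar - a_hat * x_bar          # the fitted line passes through the means
    sigma2_hat = sum((y - a_hat * x - b_hat) ** 2 for x, y in zip(xs, ys)) / n
    return a_hat, b_hat, sigma2_hat


# Hypothetical data lying exactly on y = 2x + 1.
a_hat, b_hat, sigma2_hat = ml_regression([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a_hat, b_hat, sigma2_hat)
```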
6.3. My assumptions are:

(a) The observations consist of $n$ pairs of values

$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.

(b) $x_k$ is normally distributed about $\xi_k$ with variance $\delta_x^2/P_k$ $(k = 1, 2, \ldots, n)$. (This
includes the general case where the errors of each observation differ but the relative accuracies
are known precisely.)

(c) $y_k$ is normally distributed about $\eta_k$ with variance $\delta_y^2/Q_k$ $(k = 1, 2, \ldots, n)$, and

(d) $\eta_k = \alpha\xi_k + \beta$. (56)


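The problem is to estimate $\alpha$, $\beta$, $\delta_x^2$, $\delta_y^2$ and, since the likelihood is expressed in terms of them
and they are unknown, $\xi_k$ $(k = 1, 2, \ldots, n)$ (cf. Dent). The likelihood function for a single
pair $(x_k, y_k)$ is

$$\frac{\sqrt{P_k Q_k}}{2\pi\delta_x\delta_y}\exp\left\{-\frac{(x_k - \xi_k)^2 P_k}{2\delta_x^2} - \frac{(y_k - \eta_k)^2 Q_k}{2\delta_y^2}\right\}$$

and thus the logarithm of the likelihood function for all the $2n$ observations is

$$L = \tfrac{1}{2}\sum \log P_k Q_k - n\log 2\pi\delta_x\delta_y - \sum\frac{(x_k - \xi_k)^2 P_k}{2\delta_x^2} - \sum\frac{(y_k - \eta_k)^2 Q_k}{2\delta_y^2} \qquad (57)$$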


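The maximum likelihood equations are:

$$\frac{\partial L}{\partial\delta_x} = 0:\quad n\delta_x^2 = \sum (x_k - \xi_k)^2 P_k \qquad (58)$$

$$\frac{\partial L}{\partial\delta_y} = 0:\quad n\delta_y^2 = \sum (y_k - \alpha\xi_k - \beta)^2 Q_k \qquad (59)$$

$$\frac{\partial L}{\partial\xi_k} = 0:\quad (x_k - \xi_k)P_k/\delta_x^2 + \alpha(y_k - \alpha\xi_k - \beta)Q_k/\delta_y^2 = 0 \qquad (60)$$

$$\frac{\partial L}{\partial\alpha} = 0:\quad \sum (y_k - \alpha\xi_k - \beta)\xi_k Q_k = 0 \qquad (61)$$

$$\frac{\partial L}{\partial\beta} = 0:\quad \sum (y_k - \alpha\xi_k - \beta)Q_k = 0 \qquad (62)$$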


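But all is not well with these equations, for substituting (60) in (58) I have

$$n\delta_x^2 = \sum \frac{(y_k - \alpha\xi_k - \beta)^2\alpha^2 Q_k^2\delta_x^4}{P_k\delta_y^4},$$

or

$$\frac{n\delta_y^4}{\alpha^2\delta_x^2} = \sum \frac{(y_k - \alpha\xi_k - \beta)^2 Q_k^2}{P_k}.$$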





Now imagine the experiment repeated indefinitely; if the results are to be consistent each
term on the right-hand side of this equation should estimate $(\delta_y^2/Q_k)Q_k^2/P_k = \delta_y^2 Q_k/P_k$.
Consequently I should then have between true values

$$\frac{n\delta_y^4}{\alpha^2\delta_x^2} = \delta_y^2\sum \frac{Q_k}{P_k},$$

hence

$$\delta_y^2 = K\alpha^2\delta_x^2 \qquad (63)$$

where $K = \frac{1}{n}\sum Q_k/P_k$ is independent of $\alpha$. Whilst this gives the maximum of the likelihood it
does not provide a sound division of the error between $x$ and $y$ since (63) will not generally hold
between true values. (The example is clearer in the case $P_k = Q_k$ when (63) is got directly.)
The equations have been solved by Dent without regard to this point: I will include a further
criticism of her method later (§ 8.2).

Thus the maximum likelihood estimates are not entirely satisfactory. The fault does not
seem to lie with the method but rather with the fact that efficient statistics do not seem to exist.
I shall show later (§ 7.2) that if the distribution of $\xi$ is normal they certainly do not exist. I am
thus forced to make another assumption; the most convenient one seems to be:

(e) The ratio $\delta_y^2/\delta_x^2$ is known, equal to $\lambda$.

6.4. In place of (58) and (59) I get (omitting the suffix from $\delta_x$ as being redundant):

$$2n\lambda\hat{\delta}^2 = \lambda\sum (x_k - \hat{\xi}_k)^2 P_k + \sum (y_k - \hat{\alpha}\hat{\xi}_k - \hat{\beta})^2 Q_k \qquad (64)$$

(60) become

$$\lambda(x_k - \hat{\xi}_k)P_k + \hat{\alpha}(y_k - \hat{\alpha}\hat{\xi}_k - \hat{\beta})Q_k = 0 \qquad (65)$$

and (61), (62) remain unaltered. (61), (62) and (65) are then $n + 2$ equations for $\alpha$, $\beta$ and $\xi_k$.
They cannot be solved explicitly except in the special case where $P_k = \mu Q_k$ but by suitable choice
of $\lambda$, $\mu$ may be taken to be unity. In other cases resort must be had to an approximate method.

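In the special case substitution of $\hat{\xi}_k$ from (65) in (61) and (62) gives

$$(\lambda + \hat{\alpha}^2)\sum Q_k(y_k - \hat{\beta})(\lambda x_k + \hat{\alpha}y_k - \hat{\alpha}\hat{\beta}) = \hat{\alpha}\sum Q_k(\lambda x_k + \hat{\alpha}y_k - \hat{\alpha}\hat{\beta})^2 \qquad (66)$$

$$(\lambda + \hat{\alpha}^2)\sum Q_k(y_k - \hat{\beta}) = \hat{\alpha}\sum Q_k(\lambda x_k + \hat{\alpha}y_k - \hat{\alpha}\hat{\beta})$$

the second of which yields

$$\bar{y} - \hat{\alpha}\bar{x} - \hat{\beta} = 0 \qquad (67)$$

where $\bar{x}$, $\bar{y}$ are the weighted means. Using (67) I transfer the origin so that $\bar{x} = \bar{y} = \hat{\beta} = 0$ and
then (66) yields the quadratic equation for $\hat{\alpha}$,

$$\hat{\alpha}(s_{yy} - \lambda s_{xx}) + (\lambda - \hat{\alpha}^2)s_{xy} = 0 \qquad (68)$$

where

$$s_{xx} = \frac{1}{n}\sum Q_k(x_k - \bar{x})^2, \qquad s_{xy} = \frac{1}{n}\sum Q_k(x_k - \bar{x})(y_k - \bar{y}), \qquad s_{yy} = \frac{1}{n}\sum Q_k(y_k - \bar{y})^2 \qquad (69)$$

giving

$$\hat{\alpha} = \theta + \sqrt{\theta^2 + \lambda} \qquad (70)$$

with

$$\theta = (s_{yy} - \lambda s_{xx})/2s_{xy} \qquad (71)$$

(67) and (70) provide the maximum likelihood values of $\alpha$ and $\beta$; substitution in (65) will provide
the $\hat{\xi}_k$ (if needed) and in (64) will give $\hat{\delta}^2$ (see below, § 6.6). The positive sign was used in (70)
because this can be shown to give a maximum to $L$: the other root provides what is called the
line of worst fit: it is the line passing through the point $(\bar{x}, \bar{y})$ with minimum likelihood.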
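
The computation in (67) to (71) may be sketched in Python as follows, for the special case $P_k = Q_k$ with $\lambda$ known. The function names and the sample data are hypothetical. One point goes beyond the text and is an assumption of this sketch: when $s_{xy}$ is negative the root with the negative sign is taken, since by symmetry (replace $x$ by $-x$) that is the root which then maximizes $L$; the text states only the positive sign.

```python
import math


def weighted_moments(xs, ys, qs):
    """Weighted means and the moments s_xx, s_xy, s_yy of (69), divisor n."""
    n = len(xs)
    q_total = sum(qs)
    x_bar = sum(q * x for q, x in zip(qs, xs)) / q_total
    y_bar = sum(q * y for q, y in zip(qs, ys)) / q_total
    s_xx = sum(q * (x - x_bar) ** 2 for q, x in zip(qs, xs)) / n
    s_xy = sum(q * (x - x_bar) * (y - y_bar) for q, x, y in zip(qs, xs, ys)) / n
    s_yy = sum(q * (y - y_bar) ** 2 for q, y in zip(qs, ys)) / n
    return x_bar, y_bar, s_xx, s_xy, s_yy


def fit_functional_line(xs, ys, lam, qs=None):
    """Estimate alpha and beta of eta = alpha*xi + beta with known error-variance ratio lam."""
    if qs is None:
        qs = [1.0] * len(xs)                      # unit weights Q_k = 1
    x_bar, y_bar, s_xx, s_xy, s_yy = weighted_moments(xs, ys, qs)
    if s_xy == 0:
        raise ValueError("s_xy is zero, so theta in (71) is undefined")
    theta = (s_yy - lam * s_xx) / (2.0 * s_xy)    # (71)
    root = math.sqrt(theta ** 2 + lam)
    # (70); the branch for s_xy < 0 is an assumption of this sketch, see above.
    alpha_hat = theta + root if s_xy > 0 else theta - root
    beta_hat = y_bar - alpha_hat * x_bar          # (67)
    return alpha_hat, beta_hat


# Hypothetical data lying exactly on y = 2x + 1; any lam then recovers the line.
alpha_hat, beta_hat = fit_functional_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], lam=1.0)
print(alpha_hat, beta_hat)
```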





The estimate of $\alpha$ provided by (70) is the same as the estimate of the line of best fit given by
many previous writers whose work I shall discuss later (§ 8.1). As a result of the above analysis
it can now be seen that it is also the maximum likelihood estimate when the errors are supposed
normally and independently distributed. Furthermore I notice that as $\lambda \to \infty$, $\delta \to 0$, $\lambda\delta^2$
remaining finite, the results tend to the usual results of Wilks when $\delta_x = 0$.

6.5. I now show $\hat{\alpha}$ is a consistent estimate of $\alpha$. For I have the following equations:

$$E(s_{xx}) = s_{\xi\xi} + \delta^2(n - 1)/n, \qquad E(s_{xy}) = s_{\xi\eta}, \qquad E(s_{yy}) = s_{\eta\eta} + \lambda\delta^2(n - 1)/n \qquad (72)$$

where $s_{\xi\xi}$, $s_{\xi\eta}$, $s_{\eta\eta}$ are the same quantities as those defined in equations (69) with $\xi$, $\eta$ in place
of $x$, $y$. I also have

$$s_{\eta\eta} = \alpha s_{\xi\eta} = \alpha^2 s_{\xi\xi} \qquad (73)$$

in view of the functional relationship.

Since $s_{xx}$, $s_{xy}$, $s_{yy}$ converge in probability (c. i. pr.) to the limit of their expectations provided
the fourth moments of certain distributions exist, it follows that $\theta$, equation (71), c. i. pr. to

$$(s_{\eta\eta} - \lambda s_{\xi\xi})/2s_{\xi\eta} = (\alpha^2 - \lambda)/2\alpha$$

whence from (70), $\hat{\alpha}$ c. i. pr. to

$$\frac{\alpha^2 - \lambda}{2\alpha} + \sqrt{\frac{(\alpha^2 - \lambda)^2}{4\alpha^2} + \lambda} = \alpha$$

as required. Then equally $\hat{\beta}$ is a consistent estimate of $\beta$.

From this proof of consistency it is clear that it holds whenever (72) holds and $s_{xx}$, $s_{xy}$, $s_{yy}$
c. i. pr.: furthermore from solving (72) and (73) for $\alpha$ it can be shown that (70) is the only
consistent estimate of $\alpha$ employing second moments except for minor alterations (such as using the
factor $n$ instead of $n - 1$ in dividing a sum of squares). Thus the estimates of $\alpha$, $\beta$ have some
justification even when the errors are not assumed to be normally distributed.
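
The limiting argument can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $\alpha > 0$, replaces the sample moments by their large-sample limits from (72) and (73), and evaluates (70) and (71). The function name `limiting_slope` and the numerical values are hypothetical.

```python
import math


def limiting_slope(alpha, s_xi_xi, delta2, lam):
    """Evaluate (70)-(71) at the large-sample limits of the moments.

    Limits used, from (72) and (73):
      s_xx -> s_xi_xi + delta2
      s_xy -> alpha * s_xi_xi
      s_yy -> alpha**2 * s_xi_xi + lam * delta2
    Assumes alpha > 0 so that the positive root applies.
    """
    s_xx = s_xi_xi + delta2
    s_xy = alpha * s_xi_xi
    s_yy = alpha ** 2 * s_xi_xi + lam * delta2
    theta = (s_yy - lam * s_xx) / (2.0 * s_xy)
    return theta + math.sqrt(theta ** 2 + lam)


# Hypothetical values: the limit returns alpha itself, while the ordinary
# regression slope s_xy / s_xx falls short of it.
alpha_limit = limiting_slope(alpha=2.5, s_xi_xi=4.0, delta2=0.3, lam=1.7)
regression_limit = (2.5 * 4.0) / (4.0 + 0.3)
print(alpha_limit, regression_limit)
```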

6.6. I now turn to the question of consistency of $\hat{\delta}^2$: to find $\hat{\delta}^2$, $\hat{\alpha}$, $\hat{\beta}$, $\hat{\xi}_k$ have to be substituted
in (64). First using (65) I have (remembering $P_k = Q_k$)

$$2n\lambda\hat{\delta}^2 = (\hat{\alpha}^2/\lambda + 1)\sum (y_k - \hat{\alpha}\hat{\xi}_k - \hat{\beta})^2 Q_k = (\hat{\alpha}^2/\lambda + 1)\sum Q_k y_k(y_k - \hat{\alpha}\hat{\xi}_k - \hat{\beta})$$

in virtue of (61) and (62). Using (65) again

$$2n\lambda\hat{\delta}^2 = \sum Q_k y_k(y_k - \hat{\alpha}x_k - \hat{\beta}) = \sum Q_k y_k(y_k - \bar{y} - \hat{\alpha}x_k + \hat{\alpha}\bar{x}) \quad \text{from (67)}$$

hence

$$2\hat{\delta}^2\lambda = s_{yy} - \hat{\alpha}s_{xy} = \tfrac{1}{2}\left\{s_{yy} + \lambda s_{xx} - \left[(s_{yy} - \lambda s_{xx})^2 + 4\lambda s_{xy}^2\right]^{1/2}\right\} \qquad (74)$$

From (72) it follows that the right-hand side of (74) c. i. pr. to

$$s_{\eta\eta} + \lambda\delta^2 - \alpha s_{\xi\eta} = \lambda\delta^2 \quad \text{from (73)}$$

so that $\hat{\delta}^2$ c. i. pr. to $\delta^2/2$ and is thus not a consistent estimate of $\delta^2$. It would appear that this
time it is the method that is at fault. $\hat{\delta}^2$ is obtained by dividing a sum of squares by a factor
$2n$; as is well known, the method of maximum likelihood when applied to the normal
distribution in order to estimate the mean and the variance gives the factor $n$ instead of $n - 1$ for the
divisor of the sum of squares in estimating the variance: here again the factor is wrong, but this
time more seriously since the estimate is not now consistent. The correct factor is, as in the
variance estimate, the number of degrees of freedom, not the number of observations: here this
is $2n - n - 2 = n - 2$. The factor is that given by least squares.
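
A small simulation illustrates the factor of one half. The sketch below is illustrative only: it assumes unit weights, draws the true values from a uniform distribution (any set of values would do), and uses a fixed seed; the function name and every numerical setting are hypothetical. It computes the maximum likelihood value of $\hat{\delta}^2$ from (74) and the value obtained when the divisor $2n$ is replaced by $n - 2$.

```python
import math
import random


def simulate_error_variance(n, alpha, beta, delta2, lam, seed=0):
    """Simulate n pairs with error variances delta2 (x) and lam*delta2 (y)."""
    rng = random.Random(seed)
    xis = [rng.uniform(0.0, 10.0) for _ in range(n)]            # true values
    xs = [xi + rng.gauss(0.0, math.sqrt(delta2)) for xi in xis]
    ys = [alpha * xi + beta + rng.gauss(0.0, math.sqrt(lam * delta2)) for xi in xis]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    s_yy = sum((y - y_bar) ** 2 for y in ys) / n
    theta = (s_yy - lam * s_xx) / (2.0 * s_xy)
    alpha_hat = theta + math.sqrt(theta ** 2 + lam)             # (70); s_xy > 0 here
    delta2_ml = (s_yy - alpha_hat * s_xy) / (2.0 * lam)         # from (74)
    delta2_adjusted = delta2_ml * 2.0 * n / (n - 2)             # divisor n - 2, not 2n
    return alpha_hat, delta2_ml, delta2_adjusted


alpha_hat, delta2_ml, delta2_adjusted = simulate_error_variance(
    n=20000, alpha=1.5, beta=2.0, delta2=1.0, lam=2.0)
print(alpha_hat, delta2_ml, delta2_adjusted)
```

With a true $\delta^2$ of 1 the first variance estimate should settle near one half and the adjusted one near unity, up to sampling fluctuation.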

6.7. In the general case where $P_k \neq Q_k$, the likelihood equations can be solved in any numerical
case by an iterative method due to Fisher or, as I shall show later (§ 8.4), may be treated as least
squares equations and solved by a routine devised by Deming: essentially these two methods
are the same. As a first approximation for the method the usual regression coefficients could
be used (Equations (55) with weights $Q_k$).

6.8. The extension of the above method to more than two variables involves no difficulties
that are not purely algebraic. Assumption (e) must be replaced by an assumption to the effect
that all the errors are known relatively, otherwise the method breaks down. Further the estimate
of errors will c. i. pr. to $\delta^2/m$ rather than $\delta^2$, where $m$ is the number of variables.

6.9. The advantage of the above method is that no mention is made of the distribution of
the true values: any set of values may be chosen without any attempt at randomness; compare
the remarks in § 5.5. When, however, one considers the problem of the existence of the regression
line of $y$ on $x$ this cannot be done without reference to the distribution of true values. If the
errors are normally distributed then I have shown that a normal distribution of $\xi$ is essential for
linearity of regression of $y$ on $x$: then $\eta$ would be normally distributed and hence the regression
of $x$ on $y$ would be linear.

The regression lines are often considered from a slightly different point of view: rather more
the view I took before I mentioned functional relationship. This starts from a bivariate
population such that the regression is linear and on this basis estimates are obtained of the lines:
commonly this population is supposed bivariate normal. The remarkable thing is that the estimate
of regression of $y$ on $x$ is the same as that given by the above method when $x$ is free from error,
equations (55). But the bivariate normal population may be regarded as compounded of a
functional relationship and an error in the dependent variable: thus this latter method is based
on the same set of assumptions as the other with the additional information that $x$, or $\xi$, they
now being the same, is normally distributed. We thus have the position that this additional
information is irrelevant to the estimation of the regression line. Since $x$ is not subject to error
this is the same line as the functional relationship, see § 5.11 (d).

Equally the bivariate population may be regarded as compounded of a functional relationship
and errors in both the variables: I now go on to show that the estimation of the functional
relationship is the same as before, so that the additional information about the distribution of $\xi$,
not $x$ now, is irrelevant to the estimation of the functional relationship. Now, of course, I can
talk about the regression of $y$ on $x$, and to make the discussion fairly general I can consider the
case where no functional relationship exists but merely two regressions, observed and true.

7. Estimation of Regression Lines 


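7.1. I assume

(a) $\pi(\xi, \eta)$ is a normal population with $\rho_{\xi\eta} \neq 0$.

(b) $\Delta\xi$, $\Delta\eta$ are normally, independently distributed so that Corollary I is satisfied.

Then $p(x, y)$ will be a normal distribution.

I first make a remark about maximum likelihood estimates in general.

Suppose the likelihood function, $L$, depends on $n$ unknown parameters $\theta_1, \theta_2, \ldots, \theta_n$: to
estimate these the equations

$$\frac{\partial L}{\partial\theta_i} = 0 \qquad (i = 1, 2, \ldots, n)$$

are solved. If now it is desired to estimate $\varphi_1, \varphi_2, \ldots, \varphi_n$, functions of $\theta_1, \theta_2, \ldots, \theta_n$, the
equations

$$\frac{\partial L}{\partial\varphi_i} = 0 \qquad (i = 1, 2, \ldots, n)$$

would have to be solved. But since

$$\frac{\partial L}{\partial\varphi_i} = \sum_k \frac{\partial L}{\partial\theta_k}\frac{\partial\theta_k}{\partial\varphi_i}$$

these two sets of equations are equivalent provided the Jacobian

$$\frac{\partial(\theta_1, \theta_2, \ldots, \theta_n)}{\partial(\varphi_1, \varphi_2, \ldots, \varphi_n)}$$

does not vanish. Accordingly if $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_n$ are the maximum likelihood estimates for $\theta_1,
\theta_2, \ldots, \theta_n$ and $\varphi_i = \varphi_i(\theta_1, \theta_2, \ldots, \theta_n)$ then the maximum likelihood solutions for $\varphi_1, \varphi_2, \ldots, \varphi_n$ are

$$\hat{\varphi}_i = \varphi_i(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_n) \qquad (i = 1, 2, \ldots, n) \qquad (75)$$

Hence once any functions of $\theta_1, \theta_2, \ldots, \theta_n$ have been estimated the $\theta_i$'s can be estimated by
solving equations of the type of (75). Similar remarks apply to least squares estimates.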

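7.2. Let

$$\begin{pmatrix} \sigma_x^2 & \rho_{xy}\sigma_x\sigma_y \\ \rho_{xy}\sigma_x\sigma_y & \sigma_y^2 \end{pmatrix}$$

be the variance-covariance matrix of $p(x, y)$. The maximum likelihood estimates of $\sigma_x^2$, $\sigma_y^2$, $\rho_{xy}$
are known (Kendall, p. 338): they are

$$\hat{\sigma}_x^2 = \frac{1}{n}\sum (x_k - \bar{x})^2, \qquad \hat{\sigma}_y^2 = \frac{1}{n}\sum (y_k - \bar{y})^2, \qquad \hat{\rho}_{xy} = \frac{1}{n}\sum (x_k - \bar{x})(y_k - \bar{y})/\hat{\sigma}_x\hat{\sigma}_y \qquad (76)$$

if the means have also to be estimated. These may be compared with (69) with $Q_k = 1$.

For the bivariate normal population these three statistics (with the means) are sufficient
statistics.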

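From Theorem II

$$\sigma_x^2 = \sigma_\xi^2 + \delta_x^2, \qquad \sigma_y^2 = \sigma_\eta^2 + \delta_y^2, \qquad \rho_{xy}\sigma_x\sigma_y = \rho_{\xi\eta}\sigma_\xi\sigma_\eta \qquad (77)$$

and for the maximum likelihood estimates I have the same equations with the circumflex over
everything, equations (77′): these latter are of the type of (75). Since there are five unknowns
on the right-hand side of (77′) it follows that they cannot all be estimated from the sample, which
only provides estimates of the left-hand sides. Accordingly two of the five quantities must be
known before the others can be estimated (cf. § 6.3). If $\delta_x$, $\delta_y$ are known the maximum likelihood
estimates of $\sigma_\xi^2$, $\sigma_\eta^2$ and $\rho_{\xi\eta}$ are

$$\hat{\sigma}_\xi^2 = \hat{\sigma}_x^2 - \delta_x^2, \qquad \hat{\sigma}_\eta^2 = \hat{\sigma}_y^2 - \delta_y^2, \qquad \hat{\rho}_{\xi\eta} = \hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y/\hat{\sigma}_\xi\hat{\sigma}_\eta$$

with $\hat{\sigma}_x$, $\hat{\sigma}_y$, $\hat{\rho}_{xy}$ given by (76).

$\alpha$ is given by $\rho_{\xi\eta}\sigma_\eta/\sigma_\xi$ and is thus estimated by

$$\hat{\alpha} = \frac{\hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y}{\hat{\sigma}_x^2 - \delta_x^2} \qquad (78)$$

which might also have been obtained from $a\sigma_x^2 = \alpha\sigma_\xi^2$. In fact, (77) are merely another way
of writing the equations $aS = \alpha\Sigma$ that I had before (§ 4.1).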
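
When the error variance of $x$ is known, the correction of the observed slope implied by $a\sigma_x^2 = \alpha\sigma_\xi^2$ can be sketched as below. The function name, its arguments and the data are hypothetical, and the form $s_{xy}/(s_{xx} - \delta_x^2)$ is a restatement of that relation in terms of sample moments with divisor $n$, offered as an illustration only.

```python
def slopes_with_known_x_error(xs, ys, delta_x2):
    """Return (a_hat, alpha_hat): the observed regression slope and the
    slope of the true line, given a known error variance delta_x2 for x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_xi = var_x - delta_x2              # estimated variance of the true values
    if var_xi <= 0:
        raise ValueError("known error variance exceeds the observed variance of x")
    return cov_xy / var_x, cov_xy / var_xi


# Hypothetical sample: observed variance 2, covariance 4, known error variance 0.4.
a_hat, alpha_hat = slopes_with_known_x_error(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0], delta_x2=0.4)
print(a_hat, alpha_hat)
```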





Other examples of the use of (77) can be treated similarly. In any case two of the quantities 
on the right-hand side must be known. 

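7.3. Suppose $\eta = \alpha\xi$; (77) can be written

$$\sigma_x^2 = \sigma^2 + \delta_x^2, \qquad \sigma_y^2 = \alpha^2\sigma^2 + \delta_y^2, \qquad \rho_{xy}\sigma_x\sigma_y = \alpha\sigma^2 \qquad (79)$$

where the suffix has been dropped from $\sigma_\xi$ and $\rho_{\xi\eta} = 1$. There are now only four
unknowns on the right-hand side: assuming as I had to before that $\delta_y^2 = \lambda\delta_x^2$, (79) become for
the estimates

$$\hat{\sigma}_x^2 = \hat{\sigma}^2 + \hat{\delta}^2, \qquad \hat{\sigma}_y^2 = \hat{\alpha}^2\hat{\sigma}^2 + \lambda\hat{\delta}^2, \qquad \hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y = \hat{\alpha}\hat{\sigma}^2.$$

Solving for $\hat{\alpha}$

$$\hat{\sigma}_y^2 = \hat{\alpha}\hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y + \lambda(\hat{\sigma}_x^2 - \hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y/\hat{\alpha})$$

giving, in the notation of (69) with $Q_k = 1$,

$$\hat{\alpha}(s_{yy} - \lambda s_{xx}) + (\lambda - \hat{\alpha}^2)s_{xy} = 0$$

which is the same as before, (68). Hence the same estimate of $\alpha$ is used whether the population
is assumed normal or whether no assumption is made about it at all. Of course the argument
here is algebraically similar to that used for the consistency proofs in Section 6: the extra point
here is the fact that $s_{xx}$, $s_{xy}$, $s_{yy}$ is a set of sufficient statistics.

As before $\hat{\alpha}$ is a consistent estimate of $\alpha$: solving for $\hat{\delta}^2$

$$\lambda\hat{\delta}^2 = \hat{\sigma}_y^2 - \hat{\alpha}^2\hat{\sigma}^2 = (s_{yy} - \hat{\alpha}s_{xy})$$

which, on comparison with the expression obtained in Section 6, (74), is clearly a consistent estimate
of $\delta^2$. $\sigma^2$ is estimated by $s_{xy}/\hat{\alpha}$, which did not occur before.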

7.4. All this is valid if a weight $Q_k$ is attached to every pair of readings $(x_k, y_k)$, for they may
be regarded as repetitions of the same pair of observations each of unit weight. Accordingly
the estimates cover the same case of $P_k = Q_k$ that I was able to solve before and are as general
as those. The case of general $P_k$, $Q_k$ is not open to this treatment since they cannot be regarded
as simple repetitions of a pair of observations, but only as the repetition of individual observations.

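7.5. (79) may also be solved under other assumptions besides $\delta_y^2 = \lambda\delta_x^2$, e.g. that $\delta_y$ is known
(this case has occurred in industrial practice). I then have

$$\hat{\sigma}_x^2 = \hat{\sigma}^2 + \hat{\delta}_x^2, \qquad \hat{\sigma}_y^2 = \hat{\alpha}^2\hat{\sigma}^2 + \delta_y^2, \qquad \hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y = \hat{\alpha}\hat{\sigma}^2$$

giving

$$\hat{\alpha}\hat{\rho}_{xy}\hat{\sigma}_x\hat{\sigma}_y + \delta_y^2 = \hat{\sigma}_y^2$$

so

$$\hat{\alpha} = (s_{yy} - \delta_y^2)/s_{xy}$$

also

$$\hat{\delta}_x^2 = (\hat{\alpha}s_{xx} - s_{xy})/\hat{\alpha}$$

All such estimates are obviously consistent.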
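
As an illustration of the case in which the error variance of $y$ is known, the three relations of (79) can be solved in a few lines of Python. The function name and the numerical moments are hypothetical; the moments are taken with divisor $n$ as in (69) with unit weights.

```python
def solve_with_known_y_error(s_xx, s_xy, s_yy, delta_y2):
    """Solve (79) for alpha, delta_x^2 and sigma^2 when delta_y^2 is known."""
    if s_xy == 0:
        raise ValueError("s_xy is zero, so the slope cannot be determined")
    alpha_hat = (s_yy - delta_y2) / s_xy
    delta_x2_hat = (alpha_hat * s_xx - s_xy) / alpha_hat
    sigma2_hat = s_xy / alpha_hat          # variance of the true values of x
    return alpha_hat, delta_x2_hat, sigma2_hat


# Hypothetical moments generated from sigma^2 = 4, alpha = 2,
# delta_x^2 = 0.5 and delta_y^2 = 1, so that s_xx = 4.5, s_xy = 8, s_yy = 17.
alpha_hat, delta_x2_hat, sigma2_hat = solve_with_known_y_error(4.5, 8.0, 17.0, delta_y2=1.0)
print(alpha_hat, delta_x2_hat, sigma2_hat)
```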

7.6. It has recently been shown by Wald that if the distribution of $\xi$ satisfies the condition

$$\liminf_{n\to\infty} \frac{1}{n}\left|(\xi_1 + \cdots + \xi_m) - (\xi_{m+1} + \cdots + \xi_n)\right| > 0$$

for m < n, then it is possible to find consistent estimates of $\alpha$, $\beta$, $\delta_x$ and $\delta_y$. The method consists
in dividing the observations into two groups, $(x_1, y_1), \ldots, (x_m, y_m)$ and $(x_{m+1}, y_{m+1}), \ldots, (x_n, y_n)$,
so that the condition is satisfied. Unfortunately as the inequality is strict it is
often not possible to do this: some examples are given by Wald for which it is possible, but they
would appear artificial. The normal distribution which I have discussed obviously does not
satisfy the condition for any division.

Most of the work that has been done in the past has used the methods of least squares, and 
in order to show the similarity of the above results with those of least squares as well as to put 
the subject in its historical background, it is to these methods that I now turn. 


8. Estimation by Least Squares 

8.1. The method of least squares requires no assumptions about the probability distributions
as it applies to any group of ordered pairs of observations though it is not easily justified except
in certain cases. The literature relevant to the problem here considered dates back to 1879, when
Kummell treated the general problem of least squares: he solved the problem of “best fit” for
a line (and for other curves) under the assumption that $P_k = Q_k$ and $\delta_y^2 = \lambda\delta_x^2$. However, the
paper was in an obscure journal and passed unnoticed. K. Pearson tackled the problem of the
best fit of lines and planes and obtained, in some ways special cases, in some ways generalizations
of Kummell's results, of which he was ignorant. In all cases he minimized the sum of squares
of perpendicular distances from the line. Gini did some work on the subject and obtained the
same estimate of the line as Pearson. The work of Dent has already been referred to (§ 6.3).

8.2. The interesting discussion begins with Roos's⁸ paper, in which he draws attention to the
fact that all the previous methods have depended on the choice of the coordinate system. He
obtains the conditions that the function $U$ of the observations to be minimized must obey in
order that the fitted line

$$ax + by + c = 0$$

be invariant under homogeneous strain and translation: it must be of the form

$$U = \sum_k Q_k f(ax_k + by_k + c)$$

where $f$ is an arbitrary function and the $Q_k$ are weighting factors. He chooses

$$f(ax_k + by_k + c) = (ax_k + by_k + c)^2$$

and assumes the ratio


error of /error of 

to be a constant k \/X in my notation. By this means he fits a straight line of slope 

hSj.Jp 

But this is only a consistent estimate of a if 


a(a — X).yff -f- 
(a - X).y^^ -- kn^ 


by use of (72) and (73), which is true only if S* 0, X -> co or ^ -- a. Roos observed that 

these three cases provided the usual estimates. 

There is, as Roos appreciated, a certain arbitrariness in his solution, and therefore it would 
seem necessary to choose some additional criterion to select the “best” estimate from all possible 
invariant estimates. Now Roos does not seem to have noticed that Kummell's solution (which
is the same as mine) is invariant under the conditions that Roos considers provided account is 
taken of the fact that X varies under a strain, as is clear from my assumptions (§ 6.3). Suppose 
the strain is x = IX, y mY, then X Am*//* and the new estimate has 


(lY — (jtYSyy -^ 2 * A/*5j|;j;)//m5j-y ^ 0 


so that the new slope is $m\alpha/l$, as required for invariancy. Hence this solution has the advantages


of invariancy and consistency, and as I have remarked before, is the only consistent estimate 
employing second moments. Thus I reject Roos’s solution. 

It is to be noticed that more general invariant criteria are not of interest : in the density problem 
cited above it is the ratio mass to volume that is of interest and not some general transformation 
of these quantities. 

This criterion of consistency is another reason for rejecting Dent’s estimate of the slope,
without assuming $\lambda$ known, i.e.

$$\sqrt{s_{yy}/s_{xx}}.$$

This solution is just as arbitrary as that which assumes $\lambda = 1$ to which Miss Dent objects.

8.3. A recent paper by Seares²⁰ attempts to estimate the functional relationship and the
regression lines in the general case $P_k \neq Q_k$: he claims to do it by least squares, but although
the solution for a is consistent, the least squares solution is that indicated below, which differs 
from his. Furthermore the estimate of the regression line is wrong through a misunderstanding 
of the technique. What he does is to assume both V and Sy* known, which is redundant as has 
been shown above (sections 6 and 7); he then obtains two solutions for a which are similar to 
those in § 7.5 and the corresponding situation with V known. The least squares solution for 
a is then got, according to him, by taking the weighted geometric mean of these two solutions: 
this is not a minimum of anything as far as I can see. However, the solution is a consistent 
estimate and is not too difficult to calculate. 

I consider the problem afresh: first, the estimation of the regression line, $y = ax$. It is
clear from the remarks above concerning the distinction between the functional relationship and
the regression line that the regression line which is commonly used makes no mention of the
true values $\xi_k$: it is a regression of $y$ on $x$ and it is $x$, and $x$ alone, that is of interest. A quantity
cannot be said to be in error until you introduce some notion of true values; here one asks simply
for the regression on some observed values, some perfectly real readings, and the population
underlying the theory is the population of the observed values. Assuming then that the regression
is linear, at any rate approximately, with unequal weights, the error of $x$ is not relevant to the
regression of $y$ on $x$. The error of $y$, however, is relevant: for any value of $x$ there are many
values of $y$ and there is a distribution of $y$ for this given value of $x$: I wish to estimate the mean
of this distribution and to do this the knowledge of the error of $y$ is used just as it would be in
estimating the mean of any population.

I can now formulate the least squares method. This consists in minimizing the weighted
sum of squares of deviations: ideally $y_k = ax_k + b$; the deviations are accordingly $(y_k - ax_k - b)$,
where $a$ is the regression coefficient of $y$ on $x$. The weights are those of $y_k$ only, so the sum
of squares to be minimized is

$$\sum_k Q_k (y_k - ax_k - b)^2$$

yielding the usual solutions

$$\tilde{a} = s_{xy}/s_{xx}, \qquad \tilde{b} = \bar{y} - \tilde{a}\bar{x}, \qquad (80)$$

where the tilde denotes least squares estimates.
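The weighted regression of (80) can be illustrated with a minimal Python sketch. It assumes that $s_{xy}$ and $s_{xx}$ are the weighted product moment and variance about the weighted means, with the weights $Q_k$ attached to the $y$-values only; the function name `weighted_regression` is hypothetical and chosen for illustration.

```python
def weighted_regression(x, y, q):
    """Regression of y on x with weights q on the y-values only.

    Assumption: s_xy and s_xx in (80) are weighted moments about the
    weighted means; returns the pair (a, b) of the line y = a*x + b.
    """
    total = sum(q)
    x_bar = sum(qk * xk for qk, xk in zip(q, x)) / total
    y_bar = sum(qk * yk for qk, yk in zip(q, y)) / total
    s_xx = sum(qk * (xk - x_bar) ** 2 for qk, xk in zip(q, x)) / total
    s_xy = sum(qk * (xk - x_bar) * (yk - y_bar)
               for qk, xk, yk in zip(q, x, y)) / total
    a = s_xy / s_xx          # slope, first of equations (80)
    b = y_bar - a * x_bar    # intercept, second of equations (80)
    return a, b


# Illustrative (made-up) readings with equal weights.
a, b = weighted_regression([0.0, 1.0, 2.0], [0.0, 1.0, 5.0], [1.0, 1.0, 1.0])
print(a, b)
```

Only the weights of $y$ enter, in keeping with the argument that the error of $x$ is not relevant to the regression of $y$ on $x$.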

8.4. The true regression lines are not amenable to attack by least squares methods as a knowledge
of the error is required beforehand, which error is always estimated from the minimum of
the sum of squares. However the estimate of the “best” line or the functional relationship can
be found: the argument is the same as that above for the regression line except that now $x$ is in
error because the true values have a meaning and thus the weights are no longer $Q_k$. In fact

var (yk - ox* p) = var (jk) + var (x*) . a» 

- V/G* + 

I have to bring in the same assumption 8^* = XV, whence 

var Ck* - OCX* - P) - W/Ca + ^VPk) 
and the weight of $(y_k - \alpha x_k - \beta)$ is

$$P_k Q_k/(\lambda P_k + \alpha^2 Q_k)$$



so that the expression to be minimized is

$$\sum_k \frac{P_k Q_k}{\lambda P_k + \alpha^2 Q_k}\,(y_k - \alpha x_k - \beta)^2. \qquad (81)$$

(Most writers implicitly assume $\lambda$ known, equal to 1, by assuming that $P_k$, $Q_k$ are known relatively,
that is are measured on the same scale.) The main feature of interest in (81) is that the weights
contain the unknown $\alpha$: accordingly they must be included in the differentiation; this does
not seem to have been admitted as a possibility before. Seares trips up on this point when he
attempts to use such weights in the (wrong) estimation of the regression line: he merely uses
weights, $R_k$ say, obtains an answer in terms of them, and then investigates the nature of these
weights, saying that the theoretical difficulty that the $R_k$ contain the unknown $\alpha$ is of no
consequence. That it is of consequence may be seen by comparing the results.

Again (81) can only be minimized with $P_k = Q_k$; the general result can be obtained by the
iterative method mentioned before. Differentiation with respect to $\beta$ immediately gives the
second of equations (80); then, transferring the origin to the mean so that $\beta = 0$, differentiation with respect
to $\alpha$ gives

$$\sum Q_k(\lambda + \alpha^2)(y_k - \alpha x_k)x_k + \sum Q_k\,\alpha\,(y_k - \alpha x_k)^2 = 0 \qquad (82)$$

i.e.

$$\alpha^2 s_{xy} + \alpha[\lambda s_{xx} - s_{yy}] - \lambda s_{xy} = 0$$

which is the same quadratic equation as before, (68). Hence I arrive at the usual estimate of
the line of best fit as given by Kummell¹ (quoted by Deming¹¹). The estimate of error is provided
by the minimum sum of squares divided by the degrees of freedom $n - 2$: this sum is



where the origin is such that p = 0. (82) can be written 

^"^Qkiyk “* = ^Qki^ + hXk)Xk 

hence the estimate of V is 

V = ” 1-2 ^Qkiyk - ^Xk)ykl'>^ 
so 

~ /I — 2 i ^ 

which may be compared with (74). This shows that the least squares technique provides the same
estimates of $\alpha$ and $\beta$ as maximum likelihood in both the cases considered above and provides
a consistent estimate of the error. There is no justification of this latter use of least squares
comparable with the very elegant Markoff¹⁰ theorem (more correctly called Gauss’s theorem)
of which the estimation of the regression line provides an example.
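As a hedged numerical illustration of the line of best fit, the Python sketch below solves a quadratic of the form $\alpha^2 s_{xy} + \alpha(\lambda s_{xx} - s_{yy}) - \lambda s_{xy} = 0$. It rests on several assumptions: that this is the quadratic intended by (82) and (68), that $\lambda$ is the ratio of the error variance of $y$ to that of $x$, that the weights are equal, and that the root with the sign of $s_{xy}$ is the one required. The function names are hypothetical.

```python
import math


def second_moments(x, y):
    """Equal-weight second moments about the means (assumed form of s_xx, s_xy, s_yy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xk - x_bar) ** 2 for xk in x) / n
    s_yy = sum((yk - y_bar) ** 2 for yk in y) / n
    s_xy = sum((xk - x_bar) * (yk - y_bar) for xk, yk in zip(x, y)) / n
    return s_xx, s_xy, s_yy


def functional_slope(x, y, lam):
    """Root of s_xy*a^2 + (lam*s_xx - s_yy)*a - lam*s_xy = 0 with the sign of s_xy.

    lam is assumed to be (error variance of y) / (error variance of x);
    s_xy must be non-zero.
    """
    s_xx, s_xy, s_yy = second_moments(x, y)
    d = s_yy - lam * s_xx
    return (d + math.sqrt(d * d + 4.0 * lam * s_xy ** 2)) / (2.0 * s_xy)


# Illustrative (made-up) readings.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.3, 2.8, 4.2]
lam = 2.0
slope = functional_slope(x, y, lam)

# Invariance under a strain X = l*x, Y = m*y: lam becomes lam*m^2/l^2
# and the slope becomes m*slope/l.
l, m = 3.0, 5.0
strained = functional_slope([l * v for v in x], [m * v for v in y], lam * m ** 2 / l ** 2)
print(slope, strained, m * slope / l)
```

The last line prints the strained slope next to $m\alpha/l$; under the stated assumptions the two agree, which is the invariance property claimed for this solution in § 8.2.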

8.5. These results have been obtained by many writers (as mentioned above) by minimizing 

^Pk(xk - ^kY + ^QkKyk - - W* 

with respect to a, p and 5;^ (^ = I, 2, . . . n): they yield (61), (62) and (65) and hence the same 
estimates of a, p and The advantage of the above method is that the redundant 5* are never 
mentioned. 

8.6. Sections 6, 7 and 8 may be summarized very briefly by saying that a should be estimated 
by (70), p by (67) and the error by 

{Syy Oi’a-y) 

or with the additional factor nl(n — 2) on the right-hand side. It being assumed that Pk = Qk* 
= XV and either 

(a) errors are normally distributed (§6.4) 





or 

(b) the population 7 t( 5 , n) is bivariate normal (so that a can be estimated as well) 
{§ 7.3), 

or 

(c) whenever least squares methods are used (§8.4) 

a should be estimated by s^yls^^ and ^ a^ with the weights of y only when either 

{a) Tr(5, n) is bivariate normal and Pjc Qk (§ 7 . 2), 
or ... 

(b) whenever least squares methods are used and whatever be the weights of (justified 
by Gauss’s theorem) (§ 8.3). 

If 7r(5, vi) is bivariate normal and 8^/ the equations (79) can be solved under other 

assumptions to yield estimates of a and the error (§ 7.3), and can be used in the form (77) whenever 
the assumption of the existence of a functional relationship is not used. 

If Pig Qy in the above methods, except when tu(5, n) is assumed normal, iterative methods 
of solution are available. 


This work has been carried out as part of the research programme of the National Physical 
Laboratory, and this paper is published by permission of the Director of the Laboratory. 


9. References 

¹ Kummell, C. H. (1879), The Analyst (Des Moines), 6, 97.

² Pearson, K. (1901), Philosophical Magazine, 2, 559.

³ Campbell, N. R. (1920), ibid., 39, 177.

⁴ Stewart, R. M. (1920), ibid., 40, 217.

⁵ Gini, C. (1921), Metron, 1 (3), 63.

⁶ Koshal, R. S. (1933), J. Roy. Stat. Soc., 96, 303.

⁷ Dent, B. M. (1935), Proc. Physical Soc., 47, 92.

⁸ Roos, C. F. (1937), Metron, 13 (1), 3.

⁹ Allen, H. V. (1938), Stat. Research Memoirs, 2, 60.

¹⁰ David, F. N., and Neyman, J. (1938), ibid., 2, 105.

¹¹ Deming, W. E. (1938), Some Notes on Least Squares. Washington: Department of Agriculture.

¹² Eisenhart, C. (1939), Ann. Math. Stat., 10, 162.

¹³ Eddington, A. S. (1940), Mon. Not. Roy. Astr. Soc., 100, 354.

¹⁴ Wald, A. (1940), Ann. Math. Stat., 11, 284.

¹⁵ Charnley, F. (1941), Canadian J. Research, A.19, 139.

¹⁶ Geary, R. C. (1942), Proc. Roy. Irish Acad., A.47, 63; (1944), ibid., A.49, 177.

¹⁷ Haavelmo, T. (1943), Econometrica, 11, 1.

¹⁸ Kendall, M. G. (1945), Advanced Theory of Statistics, Vol. I. London: Griffin & Co., Ltd.

¹⁹ Wilks, S. S. (1946), Mathematical Statistics. Princeton: University Press.

²⁰ Seares, F. H. (1944), Astrophysical J., 100, 255.





Grouping Corrections for High Autocorrelations 

By H. E. Daniels 
Wool Industries Research Association 

1. Introduction 

In applying Sheppard’s corrections to bivariate second order moments it is customary to assume 
that the product moment requires no adjustment. While the assumption is usually true, the con- 
ditions for its validity no longer hold in cases of nearly perfect correlation, and if Sheppard’s 
correction is then applied to the variances while leaving the covariance unaltered, the adjusted 
correlation coefficient is sometimes found to exceed unity. The behaviour of the grouped product 
moment is essentially determined by the effect of grouping on the residual errors about the regres- 
sion line, and difficulty arises when the standard error of the residuals is small compared with the 
group interval, even though the latter is itself small relative to the standard deviation of the 
individual variates. 

The question of the correct adjustments to make in such cases assumes some importance in
work on stationary time series when interest is centred on the high correlations corresponding to
short lags. Cunningham and Hynd(1), for example, consider the survival chance of a target
subjected to short bursts from a gun whose successive shots are highly autocorrelated in aim.
Their experimental data are obtained under rigorous flying conditions and may have to be recorded
in rather crude group intervals, making it necessary to adjust the resulting correlograms for the
effect of grouping before using them to calculate target survival chances. A similar situation
occurs when the autocorrelations are calculated on a relay computer of limited capacity, the
original data having to be grouped to lie within the scope of the machine.

It is to be expected that the product moment grouping correction will be sensitive to the form
of the distribution of residuals when there is high correlation. We consider in the present note
the correction to be made to high autocorrelations when these residuals (or more appropriately,
the differences between variate values separated by the lag intervals) are normally distributed, and
the effect of a moderate degree of non-normality is discussed. The results should be applicable
to long stationary time series, where the distribution of the differences is often found to be nearly
normal. Short sample series whose length is of the order of a few “periods” are not likely to
satisfy the required condition.


2. The General Bivariate Case 

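Let $x_1$, $x_2$ in the bivariate distribution $f(x_1, x_2)\,dx_1\,dx_2$ be grouped respectively in intervals of
width $\omega_1$, $\omega_2$, the centres of the initial intervals being located at $\xi_1$, $\xi_2$. The grouped frequency
corresponding to intervals $r_1$, $r_2$ is

$$F(r_1, r_2; \xi_1, \xi_2) = \int_{r_1\omega_1 + \xi_1 - \frac{1}{2}\omega_1}^{r_1\omega_1 + \xi_1 + \frac{1}{2}\omega_1} dx_1 \int_{r_2\omega_2 + \xi_2 - \frac{1}{2}\omega_2}^{r_2\omega_2 + \xi_2 + \frac{1}{2}\omega_2} dx_2\, f(x_1, x_2)$$

and the characteristic function for the grouped distribution is

$$M(t_1, t_2; \xi_1, \xi_2) = \sum_{r_1=-\infty}^{\infty} \sum_{r_2=-\infty}^{\infty} e^{it_1(r_1\omega_1 + \xi_1) + it_2(r_2\omega_2 + \xi_2)}\, F(r_1, r_2; \xi_1, \xi_2)$$

which has periods $\omega_1$, $\omega_2$ in $\xi_1$, $\xi_2$ respectively. Following Fisher(2), we develop $M$ as a double
Fourier series

$$M(t_1, t_2; \xi_1, \xi_2) = \sum_{s_1=-\infty}^{\infty} \sum_{s_2=-\infty}^{\infty} A_{s_1 s_2}\, e^{2\pi i (s_1\xi_1/\omega_1 + s_2\xi_2/\omega_2)}$$

the coefficients being

$$A_{s_1 s_2} = \frac{1}{\omega_1\omega_2} \int_{-\frac{1}{2}\omega_1}^{\frac{1}{2}\omega_1} d\xi_1 \int_{-\frac{1}{2}\omega_2}^{\frac{1}{2}\omega_2} d\xi_2\, M(t_1, t_2; \xi_1, \xi_2)\, e^{-2\pi i (s_1\xi_1/\omega_1 + s_2\xi_2/\omega_2)}$$

$$= \frac{1}{\omega_1\omega_2} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(z_1, z_2)\, dz_1\, dz_2 \int_{z_1 - \frac{1}{2}\omega_1}^{z_1 + \frac{1}{2}\omega_1} dx_1 \int_{z_2 - \frac{1}{2}\omega_2}^{z_2 + \frac{1}{2}\omega_2} dx_2\, e^{i(t_1 - 2\pi s_1/\omega_1)x_1 + i(t_2 - 2\pi s_2/\omega_2)x_2}$$

$$= \frac{\sin(\frac{1}{2}\omega_1 t_1 - \pi s_1)}{\frac{1}{2}\omega_1 t_1 - \pi s_1} \cdot \frac{\sin(\frac{1}{2}\omega_2 t_2 - \pi s_2)}{\frac{1}{2}\omega_2 t_2 - \pi s_2}\; \varphi\!\left(t_1 - \frac{2\pi s_1}{\omega_1},\; t_2 - \frac{2\pi s_2}{\omega_2}\right)$$

where

$$\varphi(t_1, t_2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{it_1 x_1 + it_2 x_2} f(x_1, x_2)\, dx_1\, dx_2.$$

Hence

$$M(t_1, t_2; \xi_1, \xi_2) = \sum_{s_1=-\infty}^{\infty} \sum_{s_2=-\infty}^{\infty} \frac{\sin(\frac{1}{2}\omega_1 t_1 - \pi s_1)}{\frac{1}{2}\omega_1 t_1 - \pi s_1} \cdot \frac{\sin(\frac{1}{2}\omega_2 t_2 - \pi s_2)}{\frac{1}{2}\omega_2 t_2 - \pi s_2}\; \varphi\!\left(t_1 - \frac{2\pi s_1}{\omega_1},\; t_2 - \frac{2\pi s_2}{\omega_2}\right) e^{2\pi i (s_1\xi_1/\omega_1 + s_2\xi_2/\omega_2)}$$

and the moments $\mu$ of the ungrouped distribution are given in terms of the grouped moments $m$,
as far as the second order, by

$$\mu_{10} = m_{10} - {\sum_{s=-\infty}^{\infty}}' (-)^s\, \frac{i\omega_1}{2\pi s}\, \varphi\!\left(-\frac{2\pi s}{\omega_1}, 0\right) e^{2\pi i s \xi_1/\omega_1}$$

$$\mu_{20} = m_{20} - \frac{\omega_1^2}{12} - {\sum_{s=-\infty}^{\infty}}' (-)^s \left[\frac{\omega_1^2}{2\pi^2 s^2}\, \varphi\!\left(-\frac{2\pi s}{\omega_1}, 0\right) + \frac{\omega_1}{\pi s}\, \frac{\partial\varphi}{\partial t_1}\!\left(-\frac{2\pi s}{\omega_1}, 0\right)\right] e^{2\pi i s \xi_1/\omega_1}$$

$$\mu_{11} = m_{11} + {\sum_{s_1=-\infty}^{\infty}}' {\sum_{s_2=-\infty}^{\infty}}' (-)^{s_1+s_2}\, \frac{\omega_1\omega_2}{4\pi^2 s_1 s_2}\, \varphi\!\left(-\frac{2\pi s_1}{\omega_1}, -\frac{2\pi s_2}{\omega_2}\right) e^{2\pi i (s_1\xi_1/\omega_1 + s_2\xi_2/\omega_2)}$$
$$\qquad - {\sum_{s_1=-\infty}^{\infty}}' (-)^{s_1}\, \frac{\omega_1}{2\pi s_1}\, \frac{\partial\varphi}{\partial t_2}\!\left(-\frac{2\pi s_1}{\omega_1}, 0\right) e^{2\pi i s_1\xi_1/\omega_1} - {\sum_{s_2=-\infty}^{\infty}}' (-)^{s_2}\, \frac{\omega_2}{2\pi s_2}\, \frac{\partial\varphi}{\partial t_1}\!\left(0, -\frac{2\pi s_2}{\omega_2}\right) e^{2\pi i s_2\xi_2/\omega_2} \qquad (2.2)$$

($\sum'$ denoting summation excluding $s = 0$).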

The first two are the ordinary univariate formulae, and subject to the customary conditions
under which Sheppard’s corrections are valid the remainders may be neglected, since $\omega$ is assumed
small compared with $\sigma = \sqrt{\mu_{20}}$. As pointed out by Kendall(3), if the group origins $\xi_1$ and $\xi_2$
are considered to be located at random, all the remainders, including that of $\mu_{11}$, average out to
zero and the average relations

$$\mu_{11} = \bar{m}_{11}, \qquad \mu_{20} = \bar{m}_{20} - \frac{\omega_1^2}{12}$$

hold whatever the size of the group intervals or the value of the correlation. But this remark is
not relevant to the present problem, since the operations of averaging $m_{20}$, $m_{02}$ and $m_{11}$ are not
independent and it is not legitimate to calculate a correlation coefficient from the average grouped
moments in cases where the remainder for $\mu_{11}$ is not negligible.


3. The Corrections for High Autocorrelations 

Autocorrelation coefficients calculated from stationary time series have the simplifying feature
that the group interval $\omega$ and group origin $\xi$ are the same for both variates $x_1$ and $x_2$. The
distribution for each of the correlated variates is the same, though their joint distribution is not
necessarily symmetrical in $x_1$, $x_2$. (Consider, for example, a time series generated from
independent $\epsilon_t$'s whose distribution is skew.) Nevertheless it is convenient to work in terms
of $x_1 + x_2$ and $x_1 - x_2$, whose joint characteristic function is $\psi(u, v) = \varphi(u + v, u - v)$. The
product moment formula (2.2) may be written in the form

$$\mu_{11} = m_{11} - \sum_{s=1}^{\infty} \frac{\omega^2}{4\pi^2 s^2} \left\{ \psi\!\left(0, \frac{2\pi s}{\omega}\right) + \psi\!\left(0, -\frac{2\pi s}{\omega}\right) \right\} \qquad (3.1)$$


(2^ + r)7r\ 
(•> 


), r + 6. Since ^(r, 0) 


together with a remainder made up of terms involving <1^ { — 

V ^ 

is the characteristic function of the aggregate Xi 4- Xa distribution, which has a standard error 

large compared with <0, and | 1 I ^ |’ 

the remainder is negligible whenever Sheppard’s corrections are valid for the univariate moments. 


On the other hand, the terms containing $\psi(0, \pm 2\pi s/\omega)$ may not be neglected, since $\psi(0, t)$ is the
characteristic function of the aggregate $x_1 - x_2$ distribution whose standard error is small
compared with $\omega$ for high enough autocorrelations.

We remark that when the autocorrelation is unity, all the higher moments of $x_1 - x_2$ are
zero for the type of distributions usually met with, and $\psi(0, 2\pi s/\omega) = 1$ for all $s$. Then

$$\mu_{11} = m_{11} - \frac{\omega^2}{12}$$

and the correlation between the grouped variates is also unity. This is as it should be,
for the observations now lie on the principal diagonal line of the square grid formed by the group
intervals, and the operation of grouping merely rearranges the observations along the line. In
the more general bivariate case when the correlation is perfect so that the observations lie along
a line, grouping in general displaces the observations to either side of the line, making the
correlation between the grouped variates less than unity.


4. The Normal Case 


In many time series the differences $x_1 - x_2$ are found to have a near normal distribution.
For short lags, this is not necessarily implied by approximate normality in the observations
themselves. Time series generated by linear processes may often be expected to have normally
distributed observations through the operation of the central limit theorem, but as the
autocorrelation approaches unity the distribution of differences depends increasingly on the form of
the distribution of the original random errors from which the time series was evolved.

When the differences can be considered normally distributed, we take $\psi(0, t) = e^{-(1-\rho)\sigma^2 t^2}$,
$\rho$ being the autocorrelation coefficient and $\sigma$ the standard deviation of the ungrouped observations
(whose distribution is itself not necessarily assumed to be normal). The product moment relation
becomes

$$\rho\sigma^2 = m_{11} - \sum_{s=1}^{\infty} \frac{\omega^2}{2\pi^2 s^2}\, e^{-4\pi^2 s^2 (1-\rho)\sigma^2/\omega^2} \qquad (4.1)$$

Writing $m_2$ for the variance of the aggregate of all the grouped observations,
$C = \frac{1}{\omega^2}(m_2 - m_{11})$ and $c = \frac{\sigma^2}{\omega^2}(1 - \rho)$, the corrected correlation is obtained as
$\rho = 1 - c\,\omega^2/\sigma^2$, where $c$ is given in terms of $C$ in Table I, col. 1, and $\sigma^2 = m_2 - \frac{\omega^2}{12}$ in the usual way.
The table was computed from the formula

$$C = c + \frac{1}{12} - \frac{1}{2\pi^2} \sum_{s=1}^{\infty} \frac{1}{s^2}\, e^{-4\pi^2 s^2 c} \qquad (4.2)$$

except for small values of $c$, when it is better to write (4.2) as

$$C = \int_0^c dc \sum_{s=-\infty}^{\infty} e^{-4\pi^2 s^2 c}$$

which transforms to

$$C = \int_0^c \frac{dc}{2\sqrt{\pi c}} \sum_{s=-\infty}^{\infty} e^{-s^2/4c} = \sqrt{\frac{c}{\pi}} + \int_0^c \frac{dc}{\sqrt{\pi c}} \sum_{s=1}^{\infty} e^{-s^2/4c} \qquad (4.3)$$

using a well known result. The integral is negligible for small $c$, since it behaves like
$4c^{3/2} e^{-1/4c}/\sqrt{\pi}$, and for values of $C$ up to 0.06 the formula

$$c = \pi C^2 \qquad (4.4)$$

is accurate to four places of decimals. (It may be used up to $C = 0.14$ with an error in $c$ of within
1 per cent.)
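The relation between $C$ and $c$ in the normal case can be checked numerically. The Python sketch below assumes that (4.2) has the form $C = c + \frac{1}{12} - \frac{1}{2\pi^2}\sum_{s \ge 1} s^{-2} e^{-4\pi^2 s^2 c}$, which follows from (4.1) and the definitions of $C$ and $c$, and inverts it by bisection. The function names and the truncation of the series are illustrative choices, not part of the original computation.

```python
import math


def grouped_C(c, terms=2000):
    """C as a function of c in the normal case (assumed form of (4.2))."""
    a = 4.0 * math.pi ** 2 * c
    total = 0.0
    for s in range(1, terms + 1):
        exponent = a * s * s
        if exponent > 700.0:      # remaining terms are negligible
            break
        total += math.exp(-exponent) / (s * s)
    return c + 1.0 / 12.0 - total / (2.0 * math.pi ** 2)


def corrected_c(C, iterations=60):
    """Invert grouped_C by bisection; C is increasing in c."""
    lo, hi = 0.0, max(C, 1.0)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if grouped_C(mid) < C:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for C in (0.05, 0.15, 0.30):
    # Compare the inverted value with the small-c formula (4.4).
    print(C, round(corrected_c(C), 5), round(math.pi * C ** 2, 5))
```

For $C = 0.15$ and $C = 0.30$ the inverted values should agree with the first column of Table I (.06988 and .21668), while the last printed column shows how far the approximation $c = \pi C^2$ has drifted by then.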


5. The Effect of Non-normalily 

From the form of (3.1) it is at once evident that, provided the remainder is negligible, the odd
moments of $x_1 - x_2$ do not enter into the product moment correction, which is therefore unaffected
by skewness in the difference distribution. To examine the effect of a moderate degree of kurtosis,
let us suppose it to be sufficiently well represented by a Gram-Charlier Type A series with terms up
to the fourth order, whose characteristic function is

$$\psi(0, t) = e^{-(1-\rho)\sigma^2 t^2} \left[ 1 + \tfrac{1}{6}\gamma_2\, \sigma^4 (1-\rho)^2 t^4 \right] \qquad (5.1)$$

ignoring the $t^3$ term for the reason stated. The relation (4.2) between $C$ and $c$ is modified to

$$C = c + \frac{1}{12} - \frac{1}{2\pi^2} \sum_{s=1}^{\infty} \frac{1}{s^2}\, e^{-4\pi^2 s^2 c} \left( 1 + \tfrac{8}{3}\pi^4 \gamma_2\, s^4 c^2 \right) \qquad (5.2)$$

which for small $c$ reduces to

$$c = \pi C^2 \Big/ \left(1 - \frac{\gamma_2}{24}\right)^2. \qquad (5.3)$$

Table I, col. 2, gives $c$ in terms of $C$ for $\gamma_2 = 1$. The percentage error in $c$ if normality is assumed
when in fact $\gamma_2 = 1$ is also shown. It is seen to increase to the value of 8.9 per cent, calculated
from (5.3), as $c$ tends to zero. Errors of this magnitude for $\gamma_2 \simeq 1$ may therefore be expected in
$1 - \rho$, the quantity usually of interest in the type of problem requiring the present theory.
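The limiting percentage error quoted above can be reproduced with a short Python sketch. It assumes the small-$c$ relation has the form $c = \pi C^2/(1 - \gamma_2/24)^2$, so that the proportional excess over the normal case is $(1 - \gamma_2/24)^{-2} - 1$; the function names are hypothetical.

```python
import math


def small_c_approx(C, gamma2=0.0):
    """Small-c approximation c = pi*C^2 / (1 - gamma2/24)^2 (assumed form of (5.3))."""
    return math.pi * C ** 2 / (1.0 - gamma2 / 24.0) ** 2


def percent_excess(gamma2):
    """Percentage by which c exceeds its normal-theory value as c tends to zero."""
    return 100.0 * (1.0 / (1.0 - gamma2 / 24.0) ** 2 - 1.0)


print(round(percent_excess(1.0), 1))
print(round(small_c_approx(0.05), 5), round(small_c_approx(0.05, 1.0), 5))
```

With $\gamma_2 = 1$ the excess comes to about 8.9 per cent, and the two values printed for $C = 0.05$ can be compared with the corresponding row of Table I.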


Acknowledgment

I am indebted to Mrs. V. Semple for the computation of Table I.


References

(1) Cunningham, L. B. C., and Hynd, W. R. B. (1946), “Random Processes in Problems of Air Warfare,”
J. Roy. Stat. Soc. Suppt., 8, No. 1, pp. 62-85.

(2) Fisher, R. A. (1921-22), “On the Mathematical Foundations of Theoretical Statistics,” Phil. Trans., A222,
pp. 309-368.

(3) Kendall, M. G., Advanced Theory of Statistics, Vol. I. Charles Griffin & Co., London.



Table I

For small $c$: $c = \pi C^2/(1 - \gamma_2/24)^2$. For large $C$: $c = C - 1/12$.

  C       c (γ₂ = 0)    c (γ₂ = 1)    Per cent difference
 .01       .00031        .00034        8.9
 .02       .00126        .00137        8.9
 .03       .00283        .00310        8.9
 .04       .00503        .00547        8.9
 .05       .00785        .00855        8.9
 .06       .01131        .01231        8.9
 .07       .01539        .01676        8.9
 .08       .02011        .02189        8.9
 .09       .02545        .02770        8.8
 .10       .03141        .03415        8.7
 .11       .03800        .04118        8.4
 .12       .04519        .04870        7.8
 .13       .05294        .05662        7.0
 .14       .06119        .06487        6.0
 .15       .06988        .07338        5.0
 .16       .07891        .08218        4.1
 .17       .08822        .09106        3.2
 .18       .09774        .10017        2.5
 .19       .10740        .10944        1.9
 .20       .11716        .11884        1.4
 .21       .12700        .12835        1.1
 .22       .13689        .13797        0.8
 .23       .14682        .14766        0.6
 .24       .15677        .15742        0.4
 .25       .16674        .16724        0.3
 .26       .17671        .17709        0.2
 .27       .18670        .18699        0.1
 .28       .19669        .19690        0.1
 .29       .20668        .20684        0.1
 .30       .21668        .21680        0.0





Some Sequential Tests of Student’s Hypothesis 
By P. Armitage, B.A.

(Communication from the National Physical Laboratory) 


1. Introduction 

Sequential analysis has been introduced in America by Wald (1945) and in this country by 
Barnard (1946), and these two papers should be consulted for the principles of the subject. 
Barnard’s paper contains a review of Wald’s work. 

I shall propose here a simple sequential test of Student’s hypothesis, based on Wald’s binomial
test, and compare it with the ordinary Student’s t-test, and also with a test proposed by Wald.

It is required to test the hypothesis $H_0$ that the mean $\mu$ of a normal population is a certain
value $\mu_0$, the variance $\sigma^2$ being unknown. Dantzig (1940) has shown that any such test based
on a single sample, whose power function is independent of $\sigma$, must have constant power for all
values of $\mu$, equal to the size of the critical region or significance level. (Such a test, for example,
would be the trivial one which results in the null hypothesis always being accepted. The power
is then constant, equal to zero.) This result, however, does not apply to sequential tests, and
in fact Stein (1945) has produced a two-sample test for Student’s hypothesis having a power
function independent of $\sigma$. The average sample size of this test depends upon $\sigma$, and the test
may be compared for economy in sampling with tests for the hypothesis that the mean is $\mu_0$ when
the variance is known. While tests of the type discussed by Stein are clearly important, I shall
consider here only sequential tests whose power is a function of $(\mu - \mu_0)/\sigma$, and so are directly
comparable with the t-test.

It is customary to distinguish two different uses of the t-test as a test of $H_0$. The one-sided
test is sensitive to values of $\mu$ on one side only of $\mu_0$ and consists in rejecting $H_0$ whenever

$$t = \frac{(\bar{x} - \mu_0)\sqrt{N - 1}}{s} > t_{2\alpha, N-1}, \qquad (1)$$

where $\bar{x}$, $s$ are the mean and standard deviation (divisor $\sqrt{N}$) of a sample of size $N$, and $t_{2\alpha, N-1}$
is taken, for example, from the table of the distribution of $t$ in Fisher and Yates (1943), corresponding
to a probability level of $2\alpha$ and $N - 1$ degrees of freedom. This test has the properties that

(i) the probability is $\alpha$ of rejecting $H_0$, when it is true,
(ii) the probability is $\le \beta$ of accepting $H_0$, when in fact $\mu - \mu_0 \ge D\sigma$ $(D > 0)$,     (2)

where $\beta$ is a function of $N$, $\alpha$ and $D$. Neyman and Pearson (1933) have shown that this defines
a uniformly most powerful test against hypotheses that $\mu > \mu_0$. Similarly, the test which rejects
$H_0$ whenever $t < -t_{2\alpha, N-1}$ has the properties (2) above, with $\ge D\sigma$ replaced by $\le -D\sigma$, and
is uniformly most powerful with respect to values of $\mu < \mu_0$.

The two-sided test is sensitive to values of $\mu$ on both sides of $\mu_0$, and consists in rejecting
$H_0$ when

$$|t| > t_{\alpha, N-1},$$

where $t_{\alpha, N-1}$ corresponds in Fisher and Yates’ Tables to a probability level of $\alpha$ and $N - 1$
degrees of freedom. This test has the properties that

(i) the probability is $\alpha$ of rejecting $H_0$, when it is true,
(ii) the probability is $\le \beta$ of accepting $H_0$, when in fact $|\mu - \mu_0| \ge D\sigma$,     (3)

$\beta$ again being a function only of $N$, $\alpha$ and $D$. This test is known to possess the optimum property
of being unbiased of type $B_1$ (Neyman, 1935).





2. One-sided Tests 


2.1. A Sequential Test. A test for Student’s hypothesis to detect one-sided alternatives may
be derived quite simply from Wald’s probability ratio sequential (P.R.S.) test for the probability $p$
in the binomial distribution.

Denote observations greater and less than $\mu_0$ by + and − respectively. On the hypothesis
$H_0$ that $\mu = \mu_0$, the probability of an observation being a − is $\frac{1}{2}$, while on the hypothesis
that $\mu = \mu_0 + D\sigma$ this probability is

$$p_1 = \frac{1}{\sqrt{2\pi}} \int_D^{\infty} e^{-\frac{1}{2}x^2}\, dx.$$

If, therefore, the observations are taken sequentially and recorded as + or −, a test fulfilling the
conditions (2) will be the Wald P.R.S. binomial test with a probability of error of the first kind
equal to $\beta$ for $p = p_1$ and a probability of error of the second kind equal to $\alpha$ for $p = \frac{1}{2}$. (The
“probability of error of the first kind” in this context is not the probability of rejecting, when
true, Student’s hypothesis, but the probability of rejecting, when true, the smaller value of $p$.)

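This test is defined approximately by the following procedure (see Barnard’s paper):

Start a score at $a_2$ marks.

For each +, add on 1 mark,

For each −, subtract $b$ marks.

Reject $H_0$ when the score reaches $a_1 + a_2$,

Accept $H_0$ when the score reaches zero,

where

$$a_1 = \frac{\log\{(1 - \beta)/\alpha\}}{\log 2(1 - p_1)}, \qquad a_2 = \frac{\log\{(1 - \alpha)/\beta\}}{\log 2(1 - p_1)}, \qquad \text{and} \qquad b = \frac{\log(1/2p_1)}{\log 2(1 - p_1)}.$$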


(From the form in which this result is quoted by Barnard, pt has been replaced by i, H and H* 
by ai and so as to avoid confusion with the notation for hypotheses, and a and p interchanged, 
since in our notation a is the probability of rejecting, when true, the higher value of p.) 

Table 1 gives the values of $a_1$, $a_2$ and $b$ for $D = 0.5\,(0.1)\,3.0$ and three different combinations
of $\alpha$ and $\beta$. (The examples given later in this paper are not restricted to the use of these values
of $\alpha$ and $\beta$.)
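The constants of the scoring scheme can be computed directly. The Python sketch below assumes the forms $a_1 = \log\{(1-\beta)/\alpha\}/\log 2(1-p_1)$, $a_2 = \log\{(1-\alpha)/\beta\}/\log 2(1-p_1)$ and $b = \log(1/2p_1)/\log 2(1-p_1)$, with $p_1$ the upper tail area of the standard normal distribution beyond $D$; the function names are hypothetical.

```python
import math


def normal_upper_tail(d):
    """P(Z > d) for a standard normal variate Z."""
    return 0.5 * math.erfc(d / math.sqrt(2.0))


def sign_test_constants(D, alpha, beta):
    """Return (p1, a1, a2, b) for the sequential sign test (assumed formulae)."""
    p1 = normal_upper_tail(D)
    denom = math.log(2.0 * (1.0 - p1))
    a1 = math.log((1.0 - beta) / alpha) / denom
    a2 = math.log((1.0 - alpha) / beta) / denom
    b = math.log(1.0 / (2.0 * p1)) / denom
    return p1, a1, a2, b


p1, a1, a2, b = sign_test_constants(1.0, 0.025, 0.10)
print(round(p1, 4), round(a1, 2), round(a2, 2), round(b, 2))
```

For $D = 1$, $\alpha = 0.025$, $\beta = 0.10$ this gives values that can be compared with the row $D = 1.0$ of Table 1.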


The above test will detect sufficiently large changes of the mean in the positive direction. It
is clearly applicable for detecting changes in the negative direction. Thus a test fulfilling
condition (2) (ii), but with $\ge D\sigma$ replaced by $\le -D\sigma$, will be defined by the following procedure:

Start a score at $a_1$ marks.

For each +, add on $b$ marks,

For each −, subtract 1 mark.

Reject $H_0$ when the score reaches zero.

Accept $H_0$ when the score reaches $a_1 + a_2$,

$a_1$, $a_2$ and $b$ being the same as before. For simplicity, it will be assumed in the Examples and
in paragraph 2.2 that in using a one-sided test we wish to detect positive changes of the mean.


Example 1

A test is required for which $D = 1$, $\alpha = 0.025$, $\beta = 0.1$.

These values of $D$, $\alpha$ and $\beta$ give $p_1 = 0.1587$, whence $a_2 = 4.38$, $b = 2.21$, $a_1 + a_2 = 11.26$.
The application of the test is much facilitated by the use of integers for $a_1$, $a_2$ and $b$. If we add
9 instead of 1 to the score for a +, we shall replace $a_2$, $b$ and $a_1 + a_2$ by $9a_2$, $9b$ and $9(a_1 + a_2)$,
which in this case are approximately equal to 39, 20 and 101 respectively. (This makes little
difference to the validity of the test, especially as it results in a good approximation to $b$.)



Table 1

        α = .01, β = .10     α = .025, β = .10     α = .05, β = .10
  D       a₁       a₂          a₁       a₂           a₁       a₂         b
 0.5    13.88     7.07       11.05     7.02         8.92     6.94      1.49
 0.6    12.08     6.15        9.62     6.11         7.76     6.04      1.61
 0.7    10.81     5.51        8.61     5.47         6.95     5.41      1.74
 0.8     9.89     5.04        7.87     5.00         6.35     4.95      1.89
 0.9     9.19     4.68        7.32     4.65         5.90     4.60      2.04
 1.0     8.65     4.41        6.89     4.38         5.55     4.33      2.21
 1.1     8.22     4.19        6.55     4.16         5.28     4.11      2.38
 1.2     7.88     4.02        6.28     3.99         5.06     3.94      2.57
 1.3     7.61     3.88        6.06     3.85         4.89     3.81      2.78
 1.4     7.39     3.77        5.89     3.74         4.75     3.70      2.99
 1.5     7.21     3.67        5.74     3.65         4.63     3.61      3.23
 1.6     7.07     3.60        5.63     3.58         4.54     3.54      3.47
 1.7     6.95     3.54        5.53     3.52         4.46     3.48      3.73
 1.8     6.85     3.49        5.46     3.47         4.40     3.43      4.01
 1.9     6.78     3.45        5.40     3.43         4.35     3.39      4.30
 2.0     6.72     3.42        5.35     3.40         4.31     3.36      4.61
 2.1     6.67     3.40        5.31     3.37         4.28     3.34      4.94
 2.2     6.63     3.38        5.28     3.35         4.26     3.32      5.28
 2.3     6.59     3.36        5.25     3.34         4.24     3.30      5.63
 2.4     6.57     3.35        5.23     3.33         4.22     3.29      6.00
 2.5     6.55     3.34        5.22     3.32         4.21     3.28      6.39
 2.6     6.54     3.33        5.21     3.31         4.20     3.27      6.79
 2.7     6.52     3.32        5.20     3.30         4.19     3.26      7.21
 2.8     6.52     3.32        5.19     3.30         4.19     3.26      7.64
 2.9     6.51     3.32        5.18     3.29         4.18     3.26      8.09
 3.0     6.50     3.31        5.18     3.29         4.18     3.25      8.55



Table 2a
μ = 0, σ = 10

 Reading    + or −    Score
                        39
    1.9       +         48
    6.1       +         57
  -22.3       -         37
   -3.1       -         17
    0.2       +         26
   -6.6       -          6
    6.1       +         15
  -11.7       -         -5
                   Accept H₀

Table 2b
μ = 10, σ = 10

 Reading    + or −    Score
                        39
    7.1       +         48
    8.8       +         57
   16.9       +         66
   -5.9       -         46
    3.7       +         55
    6.5       +         64
   17.5       +         73
   22.3       +         82
   13.0       +         91
    5.8       +        100
   -8.7       -         80
    1.0       +         89
    9.9       +         98
    1.4       +        107
                   Reject H₀

Tables 2a and 2b illustrate the application of the test to observations taken sequentially, at
random, from normal populations, first having $\mu = 0$, $\sigma = 10$, and secondly having $\mu = 10$,
$\sigma = 10$, the test being based on $\mu_0 = 0$.




Thus, in the first case Ho was accepted correctly after a sample of 8, and in the second case 
Ho was rejected correctly after a sample of 14. 
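The scoring procedure of Example 1 can be traced with a short Python sketch. It uses the integer scheme described there (start at 39, add 9 for a +, subtract 20 for a −, reject at 101, accept at zero) and the readings as transcribed from Tables 2a and 2b; the first reading of Table 2a is assumed to be 1.9, and the treatment of a reading exactly equal to $\mu_0$ is an assumption, since the text does not cover that case. The function name is hypothetical.

```python
def run_sign_test(readings, mu0, start=39, plus=9, minus=20, upper=101):
    """Apply the sequential sign-test score; return (decision, sample size, final score)."""
    score = start
    for n, x in enumerate(readings, start=1):
        if x > mu0:
            score += plus       # a '+' observation
        elif x < mu0:
            score -= minus      # a '-' observation
        # Assumption: a reading exactly equal to mu0 leaves the score unchanged.
        if score >= upper:
            return "reject", n, score
        if score <= 0:
            return "accept", n, score
    return "continue", len(readings), score


table_2a = [1.9, 6.1, -22.3, -3.1, 0.2, -6.6, 6.1, -11.7]
table_2b = [7.1, 8.8, 16.9, -5.9, 3.7, 6.5, 17.5, 22.3, 13.0, 5.8, -8.7, 1.0, 9.9, 1.4]

print(run_sign_test(table_2a, mu0=0.0))   # ('accept', 8, -5)
print(run_sign_test(table_2b, mu0=0.0))   # ('reject', 14, 107)
```

The two decisions and sample sizes agree with those stated for Tables 2a and 2b.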

Example 2 

An immediate application is to the problem of testing whether the means of two normal
populations are equal, the variances being unknown. Student’s t-test may be applied and is
valid if the variances of the two populations are equal. Otherwise it is fairly safe to use if the
sample sizes $N_1$, $N_2$ are equal ($= N$) (e.g. Hsu, 1938). In this case, however, an exact t-test may
be obtained, for if the observations, $x_i$, $y_i$ ($i = 1, 2, \ldots, N$), are independent, then $z_i = (x_i - y_i)$
is distributed normally with mean $(\mu_1 - \mu_2)$ and variance $\sigma_1^2 + \sigma_2^2$, where $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$ are
respectively the means and the standard deviations of the distributions of $x$ and $y$. In the
sequential test, therefore, we take pairs of observations $(x, y)$ from the two populations and treat the
difference $(x - y)$ as the new variable.

$H_0$ is now the hypothesis that $\mu_1 - \mu_2 = \mu_0$ (in the usual situation where it is required to test
whether $\mu_1$ and $\mu_2$ are equal, $\mu_0 = 0$), and the sequential test gives a probability of $\alpha$ of rejecting
$H_0$, when it is true, and a probability $\le \beta$ of accepting $H_0$ when in fact

$$(\mu_1 - \mu_2) \ge \mu_0 + D(\sigma_1^2 + \sigma_2^2)^{\frac{1}{2}}.$$

In Tables 3a and 3b are given random samples taken sequentially from two normal
populations, first having $\mu_1 = 0$, $\sigma_1 = 10$, $\mu_2 = 0$, $\sigma_2 = 5$; secondly having $\mu_1 = 11.2$, $\sigma_1 = 10$,
$\mu_2 = 0$, $\sigma_2 = 5$, where $11.2 = \sqrt{10^2 + 5^2}$. The test is based on $\mu_0 = 0$, $D = 1$, $\alpha = .025$,
$\beta = .1$, so the same values of $a_1$, $a_2$ and $b$ are used as in Example 1.

Table 3a ($\mu_1 = 0$, $\sigma_1 = 10$; $\mu_2 = 0$, $\sigma_2 = 5$)

    x        y       x - y    + or -    Score
                                          39
   2.9      2.0       0.9       +         48
   9.0     11.1      -2.1       -         28
   4.9     -2.9       7.8       +         37
   9.6     -0.5      10.1       +         46
   5.2      7.5      -2.3       -         26
 -13.3      6.5     -19.8       -          6
  -5.7      8.8     -14.5       -        -14
Accept $H_0$

Table 3b ($\mu_1 = 11.2$, $\sigma_1 = 10$; $\mu_2 = 0$, $\sigma_2 = 5$)

    x        y       x - y    + or -    Score
                                          39
  24.3      4.2      20.1       +         48
   7.5     -7.3      14.8       +         57
   9.7      7.3       2.4       +         66
  30.5      0.7      29.8       +         75
   5.9      0.9       5.0       +         84
  26.9     -1.2      28.1       +         93
  15.7     -3.7      19.4       +        102
Reject $H_0$

Thus, in the first case Ho was accepted correctly after a sample of 7, and in the second case 
Ho was rejected correctly after a sample of 7. 

The power of this method is discussed at the end of paragraph 2.2. 

2.2. Economy of the test as compared with Student's t-test. Denote by $H_+$ the hypothesis
that $\mu = \mu_0 + D\sigma$. The one-sided $t$-test defined by (1) has a probability $\alpha$ of rejecting $H_0$ when
true, and a probability $\beta$ of accepting $H_0$ when $H_+$ is true, where

$$1 - \beta = P\{t > t_{2\alpha,\,N-1} \mid H_+\} = P\left\{\frac{\sqrt{N-1}\,(\bar{x}/\sigma - \mu_0/\sigma - D) + \sqrt{N-1}\,D}{s/\sigma} > t_{2\alpha,\,N-1}\right\}.$$

If $H_+$ is true, $\sqrt{N}(\bar{x}/\sigma - \mu_0/\sigma - D)$ and $Ns^2/\{(N-1)\sigma^2\}$ are distributed respectively as a
normal deviate with zero mean and unit variance, and as $\chi^2/(N-1)$ with $(N-1)$ degrees of
freedom, and the quantity

$$\frac{\sqrt{N-1}\,(\bar{x}/\sigma - \mu_0/\sigma - D) + \sqrt{N-1}\,D}{s/\sigma}$$

is therefore distributed in the non-central $t$-distribution (cf. Johnson and Welch, 1940).

Using the notation for this distribution adopted in this last-quoted paper we have

$$1 - \beta = P(N-1,\ \sqrt{N}D,\ t_{2\alpha,\,N-1}),$$

and the value of $D$ for which $\beta$ takes a certain value is given by

$$\sqrt{N}D = \delta(N-1,\ t_{2\alpha,\,N-1},\ 1-\beta) = -\delta(N-1,\ -t_{2\alpha,\,N-1},\ \beta). \qquad (3a)$$

From the tables of $\delta(N-1,\ -t_{2\alpha,\,N-1},\ \beta)$ given by Johnson and Welch, and from some
tables by Neyman and Tokarska (1936), the graphs shown in Fig. 1 were drawn. For various
combinations of $\alpha$ and $\beta$ likely to be used, the graphs show the relation between $D$ and $N$. (Strictly,
the relation is defined only for integral values of $N$, but for purposes of comparison we shall use
(3a) to define the relation between $D$ and non-integral sample sizes.)

The sequential test suggested in paragraph 2.1 is clearly equivalent to this test in that it has
the same probabilities of error of the two kinds, $\alpha$ and $\beta$, with respect to the same hypotheses
$H_0$ and $H_+$. Unlike Student's test, however, the sequential test has a variable sample size, $n$,
and on each of the hypotheses $H_0$ and $H_+$, $n$ will be distributed with means $E_0(n)$ and $E_+(n)$
respectively. If we show that, say, $E_0(n) < N$ for some values of $\alpha$, $\beta$ and $D$, we may say that
when $H_0$ is true the sequential test is, in the long run, the more economical.

Now $E_0(n)$ and $E_+(n)$ are given approximately by the following formulae (see Wald):

$$E_0(n) \simeq \frac{(1-\alpha)\log\{(1-\alpha)/\beta\} - \alpha\log\{(1-\beta)/\alpha\}}{\tfrac12\log[1/\{4p_1(1-p_1)\}]} = \frac{(1-\alpha)a_2 - \alpha a_1}{\tfrac12(b-1)}, \qquad (4)$$

$$E_+(n) \simeq \frac{(1-\beta)\log\{(1-\beta)/\alpha\} - \beta\log\{(1-\alpha)/\beta\}}{p_1\log 2p_1 + (1-p_1)\log 2(1-p_1)} = \frac{(1-\beta)a_1 - \beta a_2}{p_1(b+1) - b}. \qquad (5)$$

The graphs of $E_0(n)$ and $E_+(n)$ as functions of $D$ are drawn in Fig. 1. It will be seen that for
given $\alpha$, $\beta$ and $D$, $E_0(n)$ is always considerably less than $N$, and that $E_+(n)$ and $N$ are nearly equal,
$E_+(n)$ being the smaller for $D < D_0$, where $D_0$ varies between 1.7 and 2.0.
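As a rough numerical check on (4) and (5), the following sketch evaluates the logarithmic forms directly. It assumes that $p_1$ is the probability of a "+" under $H_+$, taken here as the standard normal integral up to $D$; the function names are illustrative and are not part of the paper.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def average_sample_sizes(alpha, beta, D):
    """Approximate E0(n) and E+(n) for the one-sided sign test.

    Assumes p1 = phi(D), i.e. the chance of a "+" when mu = mu0 + D*sigma.
    """
    p1 = phi(D)
    up = math.log(2.0 * p1)              # log-likelihood step for a "+"
    down = math.log(2.0 * (1.0 - p1))    # log-likelihood step for a "-" (negative)
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log((1.0 - alpha) / beta)
    # Equation (4): the denominator is (1/2) log[1 / {4 p1 (1 - p1)}].
    e0 = ((1.0 - alpha) * log_b - alpha * log_a) / (-0.5 * (up + down))
    # Equation (5).
    e_plus = ((1.0 - beta) * log_a - beta * log_b) / (p1 * up + (1.0 - p1) * down)
    return e0, e_plus


e0, e_plus = average_sample_sizes(alpha=0.025, beta=0.1, D=1.0)
print(round(e0, 2), round(e_plus, 2))
```

With $\alpha = .025$, $\beta = .1$ and $D = 1$ the value of $E_+(n)$ comes out close to the figure of about 11.7 quoted in paragraph 3.2, where the one-sided test is used with $\alpha/2 = .025$.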

It must be remembered that if $a_1$, $a_2$ and $b$ are defined as in paragraph 2.1 in terms of $D$, $\alpha$
and $\beta$, then a test based on them will not have exactly the required properties, nor will $E_0(n)$ and
$E_+(n)$ be given exactly by the above equations, so the graphs of Fig. 1 are not strictly accurate.
Investigations based on the exact solutions of Burman (1946) for a Wald P.R.S. binomial test,
for integral $a_1$, $a_2$ and $b$, suggest that for given $D$, $\alpha$ and $\beta$, the test will produce an error of the
first kind very nearly equal to $\alpha$, and an error of the second kind less than $\beta$, and is consequently
equivalent to a Student's test of the same significance level as expected, but with a greater sample
size. $E_0(n)$ is given quite accurately by (4), but $E_+(n)$ is greater than (5). The graphs shown in
Fig. 1 are probably substantially correct.

In practice $\mu$ will usually take values other than $\mu_0$ and $\mu_0 + D\sigma$. The curve of the average
sample size $E(n)$ as a function of $\mu$ has a maximum between these two values, $E(n)$ being less than
$E_0(n)$ for $\mu < \mu_0$ and less than $E_+(n)$ for $\mu > \mu_0 + D\sigma$.

Fig. 1 may also be used to describe the relation between the sequential test used in Example 2,
with the variate $z = (x - y)$, and the corresponding fixed sample size test using $z$, which we
shall call the difference test. Now the difference test is not the most powerful with respect to
values of $\mu_1 - \mu_2$ greater than $\mu_0$. When $\sigma_1 = \sigma_2 = \sigma$, the Student's test based on the criterion

$$\frac{(\bar{x} - \bar{y})\sqrt{N-1}}{\sqrt{s_1^2 + s_2^2}},$$

where $\bar{x}$, $\bar{y}$, $s_1$ and $s_2$ are respectively the means and standard deviations in the two samples of
size $N$, is uniformly most powerful for these alternatives. We shall call this Student's test,
although the difference test also uses the $t$-distribution. Now, the denominator of the difference
test is the sample standard deviation of $z$, which is distributed as a multiple of $\chi$ with $N-1$ degrees
of freedom, while that of Student's test has $2N - 2$ degrees of freedom, and in fact we can say
that by equation (3a) the Student's test gives

$$\sqrt{N}D_1 = -\delta(2N-2,\ -t_{2\alpha,\,2N-2},\ \beta),$$

while the difference test gives

$$\sqrt{N}D_2 = -\delta(N-1,\ -t_{2\alpha,\,N-1},\ \beta),$$

where $\beta$ is the probability of accepting $H_0$ when $\mu_1 - \mu_2 = \mu_0 + \sqrt{2}D_i\sigma$ ($i = 1, 2$). For large
$N$ these two expressions tend to equality, and for fairly small $N$ their difference is small. For
example, for $N = 10$, $\alpha = .025$, $\beta = .1$, we have $D_1 = 1.08$ and $D_2 = 1.15$. Fig. 1, then, gives
a fairly good idea, at any rate for $\sigma_1 = \sigma_2$, of the relation between the sequential test for two
populations and the current Student's test.
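The two values quoted for $N = 10$ can be checked approximately without the Johnson and Welch tables. The sketch below is not the tabulated $\delta$ function: it computes the upper tail of the non-central $t$-distribution by numerical integration over the distribution of $s/\sigma$ and finds the non-centrality $\sqrt{N}D$ by bisection. The percentage points 2.262 (9 degrees of freedom) and 2.101 (18 degrees of freedom) are the usual two-sided 5 per cent values, supplied here as constants; all function names are illustrative.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_noncentral_t(df, delta, t0, steps=2000):
    """P{T > t0} for T = (Z + delta) / U, with U = sqrt(chi2_df / df).

    Simpson's rule over u, using P{T > t0 | U = u} = phi(delta - t0 * u).
    """
    # Density of U: 2 (df/2)^(df/2) u^(df-1) exp(-df u^2 / 2) / Gamma(df/2).
    c = 2.0 * math.exp(0.5 * df * math.log(0.5 * df) - math.lgamma(0.5 * df))
    upper = 1.0 + 12.0 / math.sqrt(2.0 * df)   # density is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        density = c * u ** (df - 1) * math.exp(-0.5 * df * u * u)
        total += weight * phi(delta - t0 * u) * density
    return total * h / 3.0


def detectable_D(df, t0, N, beta):
    """D such that the fixed-sample test has power 1 - beta, by bisection."""
    lo, hi = 0.0, 10.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if power_noncentral_t(df, mid, t0) < 1.0 - beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / math.sqrt(N)


D1 = detectable_D(df=18, t0=2.101, N=10, beta=0.1)   # Student's test
D2 = detectable_D(df=9, t0=2.262, N=10, beta=0.1)    # difference test
print(round(D1, 2), round(D2, 2))
```

The difference test needs a slightly larger $D$ for the same power because its denominator has only $N - 1$ degrees of freedom.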





3. Two-sided Tests

3.1. Extension of the sequential test. The test defined in paragraph 2.1 may be extended to
the problem of detecting departures of $\mu$ from $\mu_0$ in both directions. Let us denote by $T_+$ the
one-sided P.R.S. test for positive changes of $\mu$, as given in paragraph 2.1, but with $\alpha$ replaced by
$\alpha/2$, and by $T_-$ the corresponding test for negative changes of $\mu$, again with $\alpha$ replaced by $\alpha/2$.
$H_0$, $H_+$ and $H_-$ denote the hypotheses that $\mu = \mu_0$, $\mu_0 + D\sigma$ and $\mu_0 - D\sigma$, respectively. $T_+$
will then accept either $H_0$ or $H_+$, and $T_-$ either $H_0$ or $H_-$. If observations are drawn sequentially
from the population and subjected to both $T_+$ and $T_-$ until results are obtained from both tests,
we can have the following combinations of results:

            Test $T_+$        Test $T_-$
(I)         Accept $H_0$      Accept $H_0$
(II)        Accept $H_+$      Accept $H_0$
(III)       Accept $H_0$      Accept $H_-$

We now formulate the following test:

For (I), accept $H_0$
For (II), accept $H_+$
For (III), accept $H_-$

Fig. 2 illustrates the position on the lattice diagram (Barnard's "inspection diagram"). A + is
represented by a move of 1 unit parallel to $Ox$, and a - by 1 unit parallel to $Oy$.



Fig. 2. Lattice diagram representation of the two-sided sequential test of paragraph 3.1.


The regions are symmetrically placed, $OA = a_1/b$, $OB = a_2$, and the slopes of $BV$ and $AW$ are
respectively $1/b$ and $b$. (The boundaries are strictly not straight lines, but in the form of steps,
as may easily be seen from the test procedure of paragraph 2.1, and the exact dimensions are
given by Stockman and Armitage (1946).)

The fourth alternative, to accept $H_+$ under $T_+$ and $H_-$ under $T_-$, is clearly impossible if
$OA < OB$, and this follows from the definitions of $a_1$, $a_2$ and $b$, provided that $\alpha < \beta$.

The final decision is made when a path reaches a shaded region, with the exception that if
both $CD$ and $AD$ are crossed (as shown by the dotted line), we accept $H_0$ immediately, instead of
waiting till the path reaches one of the shaded regions.

When $H_0$ is true, the probabilities of (II) and (III) are each $\alpha/2$; when $H_+$ is true, the
probability of (II) is $1 - \beta$; and when $H_-$ is true, the probability of (III) is $1 - \beta$. Consequently,
the test has the following properties:

(i) the probability is $\alpha$ of rejecting $H_0$, when it is true,
(ii) the probability is $\le \beta$ of rejecting $H_+$, when $\mu - \mu_0 \ge D\sigma$,                    (6)
(iii) the probability is $\le \beta$ of rejecting $H_-$, when $\mu - \mu_0 \le -D\sigma$.

3.2. Comparison with the two-sided Student's test. Suppose that, in applying the two-sided
Student's test, we accept $H_+$ when $t > t_{\alpha,\,N-1}$ and accept $H_-$ when $t < -t_{\alpha,\,N-1}$; then the
test satisfies the conditions (6), and may be directly compared for economy with the sequential
test of paragraph 3.1.

It will be noticed that the two sets of conditions (3) and (6) differ slightly in form. In (6),
the probability of rejecting $H_+$ when true, is equal to the probability of accepting $H_0$ plus the
probability of accepting $H_-$ when $H_+$ is true, and so the $\beta$ of (6) is greater than the $\beta$ of (3).
In fact, denoting these two values by $\beta_2$ and $\beta_1$ respectively, we have (paragraph 2.2):

$$\beta_2 = 1 - P(N-1,\ \sqrt{N}D,\ t_{\alpha,\,N-1}),$$

$$\beta_1 = P(N-1,\ \sqrt{N}D,\ -t_{\alpha,\,N-1}) - P(N-1,\ \sqrt{N}D,\ t_{\alpha,\,N-1}),$$

$1 - \beta_1$ being the "power" of the test with respect to $H_+$. Since $P(N-1,\ \sqrt{N}D,\ -t_{\alpha,\,N-1})$
will in all practical cases be very nearly equal to 1 (i.e. $H_-$ is very unlikely to be accepted when
$H_+$ is true), $\beta_1$ and $\beta_2$ will differ very slightly, and as the formula for $\beta_1$ does not admit of a
tabulated inversion analogous to (3a), we shall use $\beta = \beta_2$. The graphs of Fig. 1 are thus applicable
for obtaining $N$ in terms of $D$, $\alpha$ and $\beta$ if for a significance level of $\alpha$ we read off the graph
corresponding to $\alpha/2$.

Now the sample size of the sequential test of paragraph 3.1 is distributed as the maximum
of a sample of 2 (non-independent) observations from a population having the same distribution
as the sample size in the one-sided case, when $H_0$ is true; and as the maximum of two
(non-independent) observations taken from different populations, when $H_+$ or $H_-$ is true. (When $H_+$
is true, however, $T_-$ will almost always accept $H_0$ quickly, and the sample size will be that
required for $T_+$ to accept or reject $H_+$, which may be found from Fig. 1; similarly when $H_-$
is true.)

The sequential test with $\alpha = .05$, $\beta = .1$, $D = 1$ was repeated 60 times with samples from a
normal population, when $H_0$ was true, and the distribution of sample size had mean 13.88 and
variance 56.20. For $H_+$, the average sample size $E_+(n)$, as given by Fig. 1, is about 11.7. The
sample size of the corresponding $t$-test is about 12.6. As a second example, the test with $\alpha = .05$,
$\beta = .05$, $D = 2$ was repeated 60 times with samples from a normal population with $H_0$ true,
and the distribution of sample size had mean 5.15 and variance 5.03. For $H_+$, the average sample
size $E_+(n)$, as given by Fig. 1, is about 5.6. The sample size of the corresponding $t$-test is also
about 5.6.

3.3. Another binomial test. As an alternative to the test of paragraph 3.1, Barnard suggests
the following:

Observations are taken sequentially, and in pairs. The combinations +- or -+ are
denoted by $y$, and ++ or -- by $x$. Under $H_0$ the probability of a $y$ is $\tfrac12$, while under $H_+$ or
$H_-$ the probability of a $y$ is

$$2p_1(1 - p_1),$$

where, as before,

$$p_1 = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{D} e^{-\frac12 x^2}\,dx.$$

This test does not distinguish between $H_+$ and $H_-$, and its properties may be put in the form (3),
where $\beta = \beta_1$ (see paragraph 3.2). In practice, of course, if $H_0$ was rejected it would be quite
easy to see whether this was due to a preponderance of ++ or --, and thus whether $H_+$ or
$H_-$ was true. The sample size is twice that required for a P.R.S. test to distinguish between
$2p_1(1 - p_1)$ and $\tfrac12$ with errors of the first and second kinds $\beta$ and $\alpha$ respectively; while in the
test of paragraph 3.1, the sample size is the maximum of two (non-independent) sample sizes
for the P.R.S. test to distinguish between $p_1$ and $\tfrac12$, with errors of the first and second kinds $\beta$
and $\tfrac12\alpha$ respectively. It is not obvious how these two tests compare in general.

For the first example considered in paragraph 3.2, the test suggested by Barnard is very
uneconomical, having $E_0(n) = 32.6$ and $E_+(n) = 42.0$.
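These two figures can be approximately recovered from Wald's average sample number formulae applied to the pairs. The sketch below assumes $p_1$ is the standard normal integral up to $D$ and treats each pair as a single binomial trial with probability $\tfrac12$ of an unlike pair under $H_0$ and $2p_1(1-p_1)$ under the alternative; the function names are illustrative.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def paired_test_sample_sizes(alpha, beta, D):
    """Approximate E0(n) and E+(n), in single observations, for the paired test."""
    p1 = phi(D)
    q = 2.0 * p1 * (1.0 - p1)            # P(unlike pair) under H+ or H-
    step_unlike = math.log(q / 0.5)      # log-likelihood step for +- or -+
    step_like = math.log((1.0 - q) / 0.5)  # log-likelihood step for ++ or --
    log_a = math.log((1.0 - beta) / alpha)     # rejection boundary
    log_b = math.log(beta / (1.0 - alpha))     # acceptance boundary (negative)
    mean0 = 0.5 * step_unlike + 0.5 * step_like
    mean1 = q * step_unlike + (1.0 - q) * step_like
    pairs0 = ((1.0 - alpha) * log_b + alpha * log_a) / mean0
    pairs1 = (beta * log_b + (1.0 - beta) * log_a) / mean1
    # Each trial uses two observations.
    return 2.0 * pairs0, 2.0 * pairs1


e0, e_plus = paired_test_sample_sizes(alpha=0.05, beta=0.1, D=1.0)
print(round(e0, 1), round(e_plus, 1))
```

Pairing discards the direction of the departure, so the two binomial probabilities being compared ($\tfrac12$ and $2p_1(1-p_1)$) are closer together than $\tfrac12$ and $p_1$, which is why more observations are needed on average.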


3.4. Wald's non-linear test.

3.4.1. Wald (1945, pp. 183-186) proposes a P.R.S. test satisfying conditions (3) which is not
linear, i.e. does not reduce to a linear scoring procedure.

If the observations in a sample of $n$ are $x_{in}$ ($i = 1, 2, \ldots, n$), let

$$X_n = \frac{\tfrac12\displaystyle\int_0^\infty \sigma^{-n}\exp\left[-\frac{\Sigma(x_{in} - \mu_0 - D\sigma)^2}{2\sigma^2}\right]d\sigma + \tfrac12\displaystyle\int_0^\infty \sigma^{-n}\exp\left[-\frac{\Sigma(x_{in} - \mu_0 + D\sigma)^2}{2\sigma^2}\right]d\sigma}{\displaystyle\int_0^\infty \sigma^{-n}\exp\left[-\frac{\Sigma(x_{in} - \mu_0)^2}{2\sigma^2}\right]d\sigma}, \qquad (7)$$

the summations being over $i = 1, 2, \ldots, n$. (Wald's $\delta$ is replaced by $D$, and the mean $\mu_0$ has
been introduced instead of zero without any loss of generality.) At the $n$th stage,

Accept $H_0$ if $X_n \le \beta/(1 - \alpha)$.
Reject $H_0$ if $X_n \ge (1 - \beta)/\alpha$.
Take a further observation if $\beta/(1 - \alpha) < X_n < (1 - \beta)/\alpha$.

Wald shows that the power of the test is a monotonically increasing function of $|(\mu - \mu_0)/\sigma|$,
and that $X$ is a strictly increasing function of $z^2$, where $z = \Sigma(x - \mu_0)/\sqrt{\Sigma(x - \mu_0)^2}$ (the suffixes
having been dropped), without actually evaluating $X$.

The integrals in (7) are evaluated in paragraph 3.4.2, and expressions for $X$ given in terms of
$n$ and $D$, so that if $L_n$, $U_n$ are solutions for $z^2$ of the equations

$$X = \beta/(1 - \alpha)$$

and $X = (1 - \beta)/\alpha$, respectively,
the procedure at the $n$th stage becomes:

Accept $H_0$ if $z^2 \le L_n$.
Reject $H_0$ if $z^2 \ge U_n$.
Take a further observation if $L_n < z^2 < U_n$.

Values of $L_n$ and $U_n$ for $n = 2, 3, \ldots, 30$, and for various combinations of $\alpha$, $\beta$ and $D$ are given
in Table 4.

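3.4.2. The first term of (7)

$$= \frac{\tfrac12\displaystyle\int_0^\infty \sigma^{-n}\exp\left[-\frac{\Sigma(x - \mu_0)^2 - 2D\sigma\,\Sigma(x - \mu_0) + nD^2\sigma^2}{2\sigma^2}\right]d\sigma}{\displaystyle\int_0^\infty \sigma^{-n}\exp\left[-\frac{\Sigma(x - \mu_0)^2}{2\sigma^2}\right]d\sigma}. \qquad (8)$$

In the numerator of (8), put $y = \sqrt{\Sigma(x - \mu_0)^2}/\sigma - D\,\Sigma(x - \mu_0)/\sqrt{\Sigma(x - \mu_0)^2}$, and in the
denominator, put $y = \sqrt{\Sigma(x - \mu_0)^2}/\sigma$. Then (8) becomes

$$\frac{\exp\left[\tfrac12 D^2(z^2 - n)\right]\displaystyle\int_{-Dz}^{\infty}(y + Dz)^{n-2}e^{-\frac12 y^2}\,dy}{2^{\frac12(n-1)}\,\Gamma\{\tfrac12(n-1)\}}.$$

Similarly, putting $y = \sqrt{\Sigma(x - \mu_0)^2}/\sigma + D\,\Sigma(x - \mu_0)/\sqrt{\Sigma(x - \mu_0)^2}$ in the numerator, the second
term of (7) becomes

$$\frac{\exp\left[\tfrac12 D^2(z^2 - n)\right]\displaystyle\int_{Dz}^{\infty}(y - Dz)^{n-2}e^{-\frac12 y^2}\,dy}{2^{\frac12(n-1)}\,\Gamma\{\tfrac12(n-1)\}}.$$

$$\therefore\quad X = \frac{\exp\left[\tfrac12 D^2(z^2 - n)\right]}{2^{\frac12(n-1)}\,\Gamma\{\tfrac12(n-1)\}}\; I,$$

where

$$I = \int_{-Dz}^{\infty}(y + Dz)^{n-2}e^{-\frac12 y^2}\,dy + \int_{Dz}^{\infty}(y - Dz)^{n-2}e^{-\frac12 y^2}\,dy$$

$$= \int_{-\infty}^{Dz}|y - Dz|^{n-2}e^{-\frac12 y^2}\,dy + \int_{Dz}^{\infty}|y - Dz|^{n-2}e^{-\frac12 y^2}\,dy$$

$$= \int_{-\infty}^{\infty}|y - Dz|^{n-2}e^{-\frac12 y^2}\,dy.$$

$I/\sqrt{2\pi}$ is the $(n-2)$-th absolute moment about the point $Dz$ of the normal distribution with
zero mean and unit variance.

For even $n$,

$$I = \int_{-\infty}^{\infty}\sum_{j=0}^{n-2}(-1)^j\,{}^{n-2}C_j\; y^{n-2-j}(Dz)^j e^{-\frac12 y^2}\,dy$$

$$= \sqrt{2\pi}\sum_{j=0}^{\frac12(n-2)}(n-3-2j)(n-5-2j)\cdots 3\cdot 1\;\,{}^{n-2}C_{2j}\,(Dz)^{2j},$$

$$X = \frac{\exp\left[\tfrac12 D^2(z^2 - n)\right]}{\sqrt{2\pi}\,(n-3)(n-5)\cdots 3\cdot 1}\; I = \exp\left[\tfrac12 D^2(z^2 - n)\right]\sum_{j=0}^{\frac12(n-2)}\frac{{}^{n-2}C_{2j}\,(Dz)^{2j}}{(n-3)(n-5)\cdots(n-2j-1)}. \qquad (9)$$

For odd $n$,

$$X = \frac{\exp\left[\tfrac12 D^2(z^2 - n)\right]}{2^{\frac12(n-1)}\,\{\tfrac12(n-3)\}!}\; I,$$

where

$$I = \left[\int_{Dz}^{\infty} - \int_{-\infty}^{Dz}\right]\sum_{j=0}^{n-2}(-1)^j\,{}^{n-2}C_j\; y^{n-2-j}(Dz)^j e^{-\frac12 y^2}\,dy$$

$$= 2\int_{Dz}^{\infty}\sum_{j=0}^{\frac12(n-3)}{}^{n-2}C_{2j}\,(Dz)^{2j}\,y^{n-2j-2}e^{-\frac12 y^2}\,dy + 2\int_{0}^{Dz}\sum_{j=0}^{\frac12(n-3)}{}^{n-2}C_{2j+1}\,(Dz)^{2j+1}\,y^{n-2j-3}e^{-\frac12 y^2}\,dy$$

$$= 2\exp\left(-\tfrac12 D^2z^2\right)\sum_{j=0}^{\frac12(n-3)}{}^{n-2}C_{2j}\left\{(Dz)^{n-3} + \sum_{k=0}^{\frac12(n-2j-5)}(Dz)^{2(j+k)}(n-2j-3)(n-2j-5)\cdots(2k+2)\right\}$$

$$\quad - 2\exp\left(-\tfrac12 D^2z^2\right)\sum_{j=0}^{\frac12(n-5)}{}^{n-2}C_{2j+1}\left\{(Dz)^{n-3} + \sum_{k=0}^{\frac12(n-2j-7)}(Dz)^{2(j+k+1)}(n-2j-4)(n-2j-6)\cdots(2k+3)\right\}$$

$$\quad + 2\,I(Dz)\left[(Dz)^{n-2} + \sum_{j=0}^{\frac12(n-5)}{}^{n-2}C_{2j+1}\,(Dz)^{2j+1}(n-2j-4)(n-2j-6)\cdots 3\cdot 1\right], \qquad (10)$$

where $I(Dz) = \displaystyle\int_0^{Dz} e^{-\frac12 x^2}\,dx$.

From (9) and (10) we find, for example, that

when $n = 2$, $X = \exp\left[\tfrac12 D^2(z^2 - 2)\right]$,

when $n = 3$, $X = \exp\left(-\tfrac32 D^2\right)\left\{1 + (Dz)\cdot I(Dz)\cdot\exp\left(\tfrac12 D^2z^2\right)\right\}$,

when $n = 4$, $X = \exp\left[\tfrac12 D^2(z^2 - 4)\right]\cdot\{1 + (Dz)^2\}$,

when $n = 5$, $X = \tfrac12\exp\left(-\tfrac52 D^2\right)\left[\{(Dz)^2 + 2\} + \{(Dz)^3 + 3(Dz)\}\cdot I(Dz)\cdot\exp\left(\tfrac12 D^2z^2\right)\right]$,

and so on.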
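The four special cases can be checked against the integral form of $X$ by direct quadrature. The sketch below assumes the representation of $X$ as $\exp[\tfrac12 D^2(z^2-n)]$ times the integral of $|y - Dz|^{n-2}e^{-y^2/2}$, divided by $2^{(n-1)/2}\Gamma\{(n-1)/2\}$; Simpson's rule is used in place of the algebraic evaluation, and the function names are illustrative.

```python
import math


def wald_ratio_numeric(n, D, z, half_width=12.0, steps=8000):
    """X from the integral form, using composite Simpson quadrature."""
    a = D * z
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * abs(y - a) ** (n - 2) * math.exp(-0.5 * y * y)
    integral = total * h / 3.0
    norm = 2.0 ** (0.5 * (n - 1)) * math.gamma(0.5 * (n - 1))
    return math.exp(0.5 * D * D * (z * z - n)) * integral / norm


def wald_ratio_closed(n, D, z):
    """Closed forms quoted for n = 2, 3, 4, 5."""
    a = D * z
    # I(a) = integral of exp(-x^2/2) from 0 to a.
    i_a = math.sqrt(0.5 * math.pi) * math.erf(a / math.sqrt(2.0))
    grow = math.exp(0.5 * a * a)
    if n == 2:
        return math.exp(0.5 * D * D * (z * z - 2))
    if n == 3:
        return math.exp(-1.5 * D * D) * (1.0 + a * i_a * grow)
    if n == 4:
        return math.exp(0.5 * D * D * (z * z - 4)) * (1.0 + a * a)
    if n == 5:
        return 0.5 * math.exp(-2.5 * D * D) * ((a * a + 2.0)
                                               + (a ** 3 + 3.0 * a) * i_a * grow)
    raise ValueError("closed form only written out for n = 2..5")


for n in (2, 3, 4, 5):
    print(n, wald_ratio_numeric(n, 1.0, 1.3), wald_ratio_closed(n, 1.0, 1.3))
```

The values of $D$ and $z$ used in the loop are arbitrary test points chosen for illustration.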


Table 4 gives values of $L_n$, $U_n$ for four different combinations of $\alpha$, $\beta$ and $D$, and for $n \le 30$.

For $n = 1$, $z^2$ is always equal to unity, so no values of $L_1$, $U_1$ are defined. For $n > 1$, $z^2$ lies
between 0 and $n$; consequently when the solutions of the equations gave $L_n < 0$ or $U_n > n$ these
values were omitted from the tables. The results for odd values of $n > 13$ were obtained by
interpolation from those for even values. Each reading is correct to within one unit in the last
decimal place.
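For even $n$ the limits can be recomputed from the finite sum (9) by bisection, since $X$ is a strictly increasing function of $z^2$. The sketch below is an illustration of that idea, not the computing method used for Table 4; the function names are hypothetical, and a limit is reported as `None` when it falls outside the range 0 to $n$, which corresponds to an entry omitted from the table.

```python
import math


def wald_ratio_even(n, D, z2):
    """X for even n from the finite sum (9), as a function of z squared."""
    a2 = D * D * z2
    total = 0.0
    denom = 1.0
    for j in range((n - 2) // 2 + 1):
        if j > 0:
            denom *= n - 1 - 2 * j       # builds (n-3)(n-5)...(n-2j-1)
        total += math.comb(n - 2, 2 * j) * a2 ** j / denom
    return math.exp(0.5 * D * D * (z2 - n)) * total


def boundary(n, D, target):
    """Solve X = target for z squared in (0, n); None if there is no root there."""
    lo, hi = 0.0, float(n)
    if not (wald_ratio_even(n, D, lo) < target < wald_ratio_even(n, D, hi)):
        return None
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if wald_ratio_even(n, D, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, beta, D = 0.05, 0.05, 2.0
for n in (2, 4, 6):
    lower = boundary(n, D, beta / (1.0 - alpha))
    upper = boundary(n, D, (1.0 - beta) / alpha)
    print(n, lower, upper)
```

For $\alpha = \beta = .05$, $D = 2$ the lower limits at $n = 2$ and $n = 4$ and the upper limit at $n = 6$ agree with the corresponding entries of Table 4 to about the third decimal place.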



Table 4

            alpha = .05, beta = .05                 alpha = .02, beta = .05
          D = 1             D = 2                 D = 1             D = 2
  n     L_n     U_n      L_n      U_n          L_n     U_n      L_n      U_n

  2      -       -      0.528      -            -       -      0.512      -
  3      -       -      1.052      -            -       -      1.040      -
  4      -       -      1.543      -            -       -      1.531      -
  5      -       -      2.033    4.466          -       -      2.021    4.866
  6    0.023   5.632    2.523    4.908        0.010     -      2.511    5.299
  7    0.215   5.610    3.013    5.364        0.201   6.792    3.002    5.748
  8    0.398   5.643    3.504    5.829        0.383   6.771    3.492    6.208
  9    0.579   5.712    3.995    6.300        0.564   6.795    3.983    6.674
 10    0.763   5.806    4.485    6.775        0.746   6.853    4.474    7.145
 11    0.948   5.919    4.976    7.253        0.931   6.935    4.965    7.620
 12    1.136   6.047    5.467    7.733        1.118   7.036    5.456    8.098
 13    1.325   6.185    5.958    8.215        1.307   7.151    5.947    8.577
 14    1.516   6.332    6.449    8.698        1.498   7.278    6.438    9.058
 15    1.708   6.486    6.940    9.182        1.690   7.414    6.929    9.541
 16    1.902   6.646    7.431    9.667        1.883   7.558    7.420   10.025
 17    2.096   6.810    7.922   10.153        2.077   7.708    7.911   10.509
 18    2.291   6.978    8.413   10.639        2.271   7.864    8.402   10.994
 19    2.487   7.149    8.904   11.126        2.466   8.024    8.893   11.480
 20    2.682   7.323    9.395   11.613        2.662   8.188    9.384   11.966
 21    2.878   7.499    9.886   12.101        2.858   8.355    9.875   12.453
 22    3.074   7.678   10.377   12.589        3.054   8.525   10.366   12.940
 23    3.271   7.858   10.868   13.077        3.251   8.697   10.857   13.427
 24    3.468   8.040   11.359   13.565        3.448   8.872   11.348   13.915
 25    3.666   8.223   11.850   14.054        3.646   9.049   11.839   14.403
 26    3.863   8.408   12.341   14.543        3.843   9.227   12.330   14.892
 27    4.061   8.594   12.832   15.032        4.041   9.407   12.821   15.380
 28    4.258   8.780   13.323   15.521        4.238   9.588   13.312   15.869
 29    4.456   8.967   13.814   16.010        4.436   9.771   13.803   16.358
 30    4.654   9.155   14.305   16.500        4.633   9.954   14.294   16.847

Fig. 3 illustrates these results for $\alpha = \beta = .05$, $D = 2$. The boundaries are obtained by
plotting $L_n$, $U_n$ against $n$. (Since for each test $L_n$ and $U_n$ are very nearly linear functions of $n$
except for small $n$, it is probably fairly safe to extrapolate linearly for a good way past $n = 30$.)
Any sample is represented by a zig-zag line formed by plotting $z^2$ for each value of $n$, and three
such "sample paths" are shown, the samples being taken at random from populations having
(i) $\mu = 0$, (ii) $\mu = \sigma$ and (iii) $\mu = 2\sigma$. In the first case, $H_0$ is accepted at $n = 3$; in the second,
$H_0$ is accepted at $n = 2$; and in the third, $H_0$ is rejected at $n = 7$.

The test was applied 60 times to samples from each of the populations (i) and (iii). For (i),
the distribution of sample size had mean 3.57 and variance 2.81, and $H_0$ was never rejected.
For (iii), the distribution of sample size had mean 6.93 and variance 8.66, and $H_0$ was on one
occasion accepted. The sample size for the corresponding Student's test is about 5.6. The
random sampling numbers used were grouped in intervals of $0.1\sigma$. These results may be compared
with those obtained by applying the two-sided binomial test to the same populations, as
given in paragraph 3.2. Wald's test appears to be the more economical where $H_0$ is true, and
the less when $H_+$ is true.

The low number of wrong decisions obtained in these experiments suggests that the test may
be "stronger" than is stated. (The probabilities of obtaining 0 and 1 wrong decisions out of 60
when the expected number is 3, are respectively .046 and .145.) If so, the test should be compared
with a Student's test of sample size greater than 5.6.
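The two probabilities in parentheses are binomial terms with 60 trials and an error probability of .05 (so that the expected number of wrong decisions is 3). A minimal sketch of the arithmetic, with an illustrative function name:

```python
from math import comb


def binomial_term(k, trials, p):
    """Probability of exactly k occurrences in the given number of trials."""
    return comb(trials, k) * p ** k * (1.0 - p) ** (trials - k)


p_none = binomial_term(0, 60, 0.05)
p_one = binomial_term(1, 60, 0.05)
print(round(p_none, 3), round(p_one, 3))
```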






3.4.3. The theoretical basis of the test of paragraph 3.4.1 is given by Wald, and need not
be stated fully here. It is worth pointing out, however, that many tests satisfying the required
conditions may theoretically be obtained by Wald's method. Referring to p. 183 of his paper,
it is clear that the weight functions used there are arbitrary in so far as they may be replaced
by any others which result in a test satisfying the required conditions. The choice of such
functions would in practice be conditioned also by the mathematical flexibility of the formula
for $X$. Care must be taken that any limiting process, similar to Wald's "$c \to \infty$", is valid. For
example, I am indebted to Professor Wald for the observation that the weight functions defined by

$$w_{0r}(\sigma) = \begin{cases}(k-1)\,r^{k-1}/\sigma^{k}, & r \le \sigma < \infty,\ \mu = 0,\\ 0, & \text{otherwise,}\end{cases}$$

$$w_{1r}(\sigma) = \begin{cases}\tfrac12(k-1)\,r^{k-1}/\sigma^{k}, & r \le \sigma < \infty,\ \mu = \pm D\sigma,\\ 0, & \text{otherwise}\end{cases}$$

(where $k > 1$, and $r$ is a positive number which later tends to zero), are invalid in the sense that
the probabilities of error of the two kinds, for the test based on them with nominal $\alpha$ and $\beta$, are
not actually $\alpha$ and $\beta$, although $X$ may be evaluated on much the same lines as before. For, if
$\alpha_r(\sigma)$ is the error of the first kind for any $r > 0$ and any $\sigma > 0$, the concentration of $w_{0r}(\sigma)$ in the
neighbourhood of $\sigma = 0$ as $r \to 0$ makes the convergence of $\alpha_r(\sigma)$ non-uniform in $\sigma$. It therefore
does not follow from the facts that

$$\int_0^\infty w_{0r}(\sigma)\,\alpha_r(\sigma)\,d\sigma = \alpha,$$

and that $\lim_{r \to 0}\alpha_r(\sigma)$ is a constant, that this constant is $\alpha$. No such difficulty seems to be encountered
with Wald's weight functions.





Conclusions 

In this paper I have investigated several questions which arose immediately one looked for a
sequential alternative to Student's $t$-test, and I have not tried to examine the subject fully. For
example, the approach is largely that used by Wald, of considering two or more alternative
hypotheses, and although I have shown that when these hypotheses are true the sequential method
is often the more economical, I have not suggested how the practical experimenter should decide
which sequential test, if any, should replace the $t$-test he is in the habit of using. Both the
theoretical and the practical aspects of the subject suggest that much research is yet to be done.


Acknowledgments 

This paper is an expanded version of a report, QC/R/20, issued by the Ministry of Supply 
Advisory Service on Statistical Method and Quality Control (S.R. 17). Work was continued 
later as part of the research programme of the National Physical Laboratory, and this paper is 
published by permission of the Director of the Laboratory. 

The author desires to acknowledge the assistance rendered by Mr. J. P. Burman, who is
responsible for much of the algebra of paragraph 3.4.2, Mr. D. V. Lindley, who prepared
the diagrams, and by Mr. F. J. Anscombe, who read the manuscript, and suggested some
alterations which have since been made.


References 

Neyman, J., and Pearson, E. S. (1933), "On the problem of the most efficient tests of statistical hypotheses,"
Phil. Trans. Roy. Soc., A, 231, 289-337.

Neyman, J. (1935), "Sur la vérification des hypothèses statistiques composées," Bull. Soc. Math. de France, 63, 246-266.

Neyman, J., and Tokarska, B. (1936), "Errors of the second kind in testing 'Student's' hypothesis," J.
Amer. Stat. Ass., 31, 318-326.

Hsu, P. L. (1938), "Contribution to the theory of 'Student's' t-test as applied to the problem of two samples,"
Stat. Res. Mem., 2, 1-24.

Dantzig, G. B. (1940), "On the non-existence of tests of 'Student's' hypothesis having power functions
independent of σ," Ann. Math. Stat., 11, 186-192.

Johnson, N. L., and Welch, B. L. (1940), "Applications of the non-central t-distribution," Biometrika, 31,
362-389.

Fisher, R. A., and Yates, F. (1943), Statistical Tables for Biological, Agricultural and Medical Research.
Oliver & Boyd.

Wald, A. (1945), "Sequential tests of statistical hypotheses," Ann. Math. Stat., 16, 117-186.

Stein, C. (1945), "A two-sample test for a linear hypothesis whose power is independent of the variance,"
ibid., 16, 243-258.

Barnard, G. A. (1946), "Sequential tests in industrial statistics," Supp. Journ. Roy. Statist. Soc., 8, 1-21.

Burman, J. P. (1946), "Sequential sampling formulae for a binomial population," ibid., 8, 98-103.

Stockman, C. M., and Armitage, P. (1946), "Some properties of closed sequential schemes," ibid., 8, 104-112.




INDEX

TO THE SUPPLEMENT TO THE JOURNAL OF THE ROYAL STATISTICAL SOCIETY

Vol. IX, 1947

PAGE

Anscombe (F. J.), Godwin (H. J.) and Plackett (R. L.). Methods of Deferred Sentencing in
    Testing the Fraction Defective of a Continuous Output . . . 198

Armitage (P.). Some Sequential Tests of Student's Hypothesis . . . 250

Bartlett (M. S.). Multivariate Analysis . . . 176-190
    Multivariate analysis of variance
    Canonical reduction of the general regression problem . . . 178
    Theory of discriminant functions . . . 181
    Discussion of an example from anthropometry . . . 182
    General sampling distribution of the canonical roots
    Bibliography . . . 190
    Discussion: Dr. Geary; Sir C. Burt; the Chairman; Mr. C. R. Rao; Dr. Herdan; Dr. C. A. B.
        Smith; Mr. Moroney; Dr. Bartlett in reply . . . 190-197

Daniels (H. E.). Grouping Corrections for High Autocorrelations . . . 245

Exhibition of Mechanical Aids to Statistical Computation . . . 140

Factor Analysis of a Matrix of 2 × 2 Tables. See Slater (Patrick).

Factorial Experiments Derivable from Combinatorial Arrangements of Arrays. See Rao (C.
    Radhakrishna).

Finney (D. J.). The Principles of Biological Assay . . . 46-81
    Relationship of dose and response . . . 47
    Estimation of relative potency . . . 50
    Quantitative responses . . . 52
    Quantal responses . . . 58
    Bacterial assays from dilution series . . . 67
    Design of assays
    Summary . . . 74
    Appendix . . . 77
    Discussion: Mr. Fieller; Dr. Irwin; Sir P. Hartley; Mr. Gridgeman; Dr. Trevan; Dr. Emmens;
        Mr. Bacharach; Dr. Davies; Dr. Wood; Mr. Finney in reply . . . 81-91
    The Significance of Associations in a Square Point Lattice . . . 99

Godwin (H. J.). Methods of Deferred Sentencing. See Anscombe (F. J.), Godwin (H. J.)
    and Plackett (R. L.).

Grouping Corrections for High Autocorrelations. See Daniels (H. E.).

Interdependence of Blocks of Transactions, On the. See Stone (Richard).

Lindley (D. V.). Regression Lines and the Linear Functional Relationship . . . 218

Methods of Deferred Sentencing in Testing the Fraction Defective of a Continuous Output. See
    Anscombe (F. J.), Godwin (H. J.) and Plackett (R. L.).

Moran (P. A. P.). The Random Division of an Interval . . . 92

Multivariate Analysis. See Bartlett (M. S.).

Oscillatory Properties of the Moving Average. See Spencer-Smith (J. L.).

Plackett (R. L.). Methods of Deferred Sentencing. See Anscombe (F. J.), Godwin (H. J.)
    and Plackett (R. L.).

Principles of Biological Assay. See Finney (D. J.).

Random Division of an Interval. See Moran (P. A. P.).

Rao (C. Radhakrishna). Factorial Experiments Derivable from Combinatorial Arrangements
    of Arrays

Regression Lines and the Linear Functional Relationship. See Lindley (D. V.).

Sequential Tests of Student's Hypothesis. See Armitage (P.).

Significance of Associations in a Square Point Lattice. See Finney (D. J.).

Slater (Patrick). The Factor Analysis of a Matrix of 2 × 2 Tables

Spencer-Smith (J. L.). The Oscillatory Properties of the Moving Average

Statistical Investigation of Casualties Suffered by Certain Types of Vessels. See Vajda (S.).

Stone (Richard). On the Interdependence of Blocks of Transactions
    The data
    Conclusions and suggestions
    Discussion: Mr. Champernowne; Mr. Babington Smith; Sir C. Burt; Dr. Barna; Prof. Allen;
        Dr. Geary; Mr. Cohen; Miss Carruthers; Mr. Stone in reply

Vajda (S.). Statistical Investigation of Casualties Suffered by Certain Types of Vessels . . . 14
    Discussion: Mr. Maddex; Mr. Chambers; Sir W. Elderton; Mr. Buckland; Mr. R. E. Beard;
        Prof. Greenwood; Dr. Heron; Dr. Sutton; Dr. Solomon; Mr. Babington Smith; Mr.
        Quenouille; Dr. Vajda in reply












